Abstract

The  effect  of  government  programs  on  the  distribution  of  participants'  earnings  is 
important for program evaluation and welfare comparisons. This paper reports estimates
of the effects of JTPA training programs on the distribution of earnings.
The estimation uses a new instrumental variable (IV) method that measures program
impacts on the quantiles of outcome variables. This quantile treatment effects
(QTE) estimator accommodates exogenous covariates and reduces to quantile regression
when selection for treatment is exogenously determined. The QTE estimator can
be  computed  as  the  solution  to  a  convex  linear  programming  problem,  although  this 
requires  first-step  estimation  of  a  nuisance  function.  We  develop  distribution  theory 
for  the  case  where  the  first  step  is  estimated  nonparametrically.  For  women,  the 
empirical  results  show  that  the  JTPA  program  had  the  largest  proportional  impact 
at  low  quantiles.  Perhaps  surprisingly,  however,  JTPA  training  raised  the  quantiles 
of  earnings  for  men  only  in  the  upper  half  of  the  trainee  earnings  distribution. 
1.    Introduction

Effects  of  economic  variables  on  distributions  of  outcomes  are  of  fundamental  interest  in 
many  areas  of  empirical  economic  research.  A  leading  example  is  the  question  of  how 
government  programs  affect  the  distribution  of  participants'  earnings,  since  the  welfare 
analysis  of  public  policies  involves  distributions  of  outcomes.  Policy-makers  often  hope  that 
subsidized  training  programs  will  reduce  earnings  inequality  by  raising  the  lower  quantiles 
of  the  earnings  distribution  and  thereby  reducing  poverty  (Lalonde  (1995),  US  Department 
of  Labor  (1995)).  Another  example  from  labor  economics  is  the  effect  of  union  status  on  the 
distribution  of  earnings.  One  of  the  earliest  studies  of  the  distributional  consequences  of 
unionism  is  Freeman  (1980),  while  more  recent  analyses  include  Card  (1996),  and  DiNardo, 
Fortin,  and  Lemieux  (1996),  who  have  asked  whether  changes  in  union  status  can  account 
for  a  significant  fraction  of  increasing  wage  inequality  in  the  1980s. 

Although  the  importance  of  distribution  effects  is  widely  acknowledged,  most  evaluation 
research  focuses  on  average  outcomes,  probably  because  the  statistical  techniques  required 
to  estimate  effects  on  means  are  easier  to  use.  Many  econometric  models  also  implicitly 
restrict  treatment  effects  to  operate  in  the  form  of  a  simple  "location  shift" ,  in  which  case 
the  mean  effect  captures  the  impact  of  treatment  at  all  quantiles.  Of  course,  the  impact  of 
treatment  on  a  distribution  is  easy  to  assess  when  treatment  status  is  randomly  assigned 
and  there  is  perfect  compliance  with  treatment  assignment.  Randomization  guarantees 
that  outcomes  in  the  treatment  group  are  directly  comparable  to  outcomes  in  the  control 
group,  so  valid  causal  inferences  can  be  obtained  by  simply  comparing  the  treatment  and 
control  distributions.  The  problem  of  how  to  draw  inferences  about  distributional  effects 
in  randomized  studies  with  non-compliance  or  in  observational  studies  with  non-random 
assignment  is  more  difficult,  however,  and  has  received  less  attention.1 

In  this  paper,  we  show  how  to  use  a  source  of  exogenous  variation  in  treatment  status 


1  Discussions of average treatment effects include Rubin (1977), Rosenbaum and Rubin (1983), and
Heckman  and  Robb  (1985).  Manski  (1994),  Heckman,  Smith  and  Clements  (1997),  Imbens  and  Rubin 
(1997),  and  Abadie  (1999a)  discuss  effects  on  distributions.  Manski  (1994,  1997)  develops  estimators  for 
bounds  on  quantiles. 


-  an  instrumental  variable  -  to  estimate  the  effect  of  treatment  on  the  quantiles  of  the 
distribution  of  outcomes  in  non-randomized  studies,  or  in  situations  where  the  offer  of 
treatment  is  randomized  but  treatment  itself  is  voluntary.  This  Quantile  Treatment  Effects 
(QTE)  estimator  is  used  here  to  estimate  the  effect  of  training  on  trainees  served  by  the 
Job  Training  Partnership  Act  (JTPA)  of  1982,  a  large  publicly-funded  training  program 
designed  to  help  economically  disadvantaged  individuals.  The  data  come  from  the  National 
JTPA Study, a social experiment begun in the late 1980s at 16 locations across the US
to  evaluate  the  effects  of  JTPA  training.  For  this  study,  JTPA  applicants  were  randomly 
assigned  to  treatment  and  control  groups.  Individuals  in  the  treatment  group  were  offered 
JTPA  training,  while  those  in  the  control  group  were  excluded  for  a  period  of  18  months. 
Only  60  percent  of  the  treatment  group  actually  received  training,  but  we  can  use  the 
treatment  assignment  as  an  instrument  for  treatment. 

The  treatment  effects  estimated  using  the  framework  developed  here  are  valid  for  a 
subpopulation we call compliers. This terminology is used because in randomized trials
with partial compliance, like the JTPA, the relevant subpopulation consists of people who
always comply with the treatment protocol. In fact, in the case of the JTPA, where (almost)
no one in the control group received treatment, effects for compliers are also representative
of effects on the treated.2 In other cases, compliers are those whose treatment status is
affected by an instrumental variable.

The identification results underlying the compliers approach to instrumental variables
(IV)  models  were  first  established  by  Imbens  and  Angrist  (1994)  and  Angrist,  Imbens,  and 
Rubin  (1996).  Imbens  and  Rubin  (1997)  extended  these  results  to  the  identification  of 
the  effect  of  treatment  on  distributions,  and  Abadie  (1999a)  showed  how  to  test  global 
hypotheses  about  distribution  impacts  such  as  stochastic  dominance.  But  neither  of  these 
papers  developed  simple  estimators  or  a  scheme  for  estimating  the  effect  of  treatment  on 
quantiles.  We  focus  here  on  conditional  quantiles  because  quantiles  provide  useful  summary 


2  Angrist  and  Imbens  (1991)  discuss  the  relationship  between  instrumental  variables  and  effects  on  the 
treated.  Orr,  et  al.  (1996)  and  Heckman,  Smith,  and  Taber  (1994)  report  average  effects  on  the  treated 
in the JTPA. Heckman, Smith and Clements (1997) estimate the distribution of JTPA treatment effects
using  a  non-IV  framework. 


statistics  for  distributions,  and  because  quantile  comparisons  have  been  at  the  heart  of 
recent  discussions  of  changing  wage  inequality  (see,  e.g.,  Chamberlain  (1991),  Katz  and 
Murphy  (1992)  and  Buchinsky  (1994)). 

The  paper  is  organized  as  follows.  Section  2  outlines  the  conceptual  framework  and 
discusses  the  identification  problem.  Section  3  presents  the  estimator,  which  allows  for  a 
binary  endogenous  regressor  (indicating  exposure  to  treatment)  and  reduces  to  Koenker 
and  Bassett  (1978)  quantile  regression  when  selection  for  treatment  is  exogenous.  Like 
quantile  regression,  the  estimator  developed  here  can  be  written  as  the  solution  to  a  convex 
linear  programming  (LP)  problem,  although  implementation  of  the  QTE  estimator  requires 
estimation  of  a  nuisance  function  in  a  first  step.  Finally,  Section  4  discusses  the  estimates  of 
effects  of  training  on  the  quantiles  of  trainee  earnings.  The  estimates  for  women  show  larger 
proportional  increases  in  earnings  at  lower  quantiles  of  the  trainee  earnings  distribution. 
But  the  estimates  for  men  suggest  the  impact  of  training  was  largest  in  the  upper  half  of 
the  distribution  and  not  at  lower  quantiles  as  policy-makers  perhaps  would  have  wished. 

2.    Conceptual  Framework 

The  setup  is  as  follows.  The  data  consist  of  n  observations  on  a  continuously  distributed 
outcome  variable,  Y,  a  binary  treatment  indicator  D,  and  a  binary  instrument,  Z.  In  the 
case  of  subsidized  training,  Y  is  earnings,  D  indicates  program  participation,  and  Z  is  an 
indicator  of  the  randomized  offer  of  training.  Z  and  D  are  not  equal  in  the  JTPA  because 
not  everyone  who  was  offered  training  received  it  and  because  a  few  people  who  were  not 
offered  training  received  services  anyway.  In  a  study  of  the  effect  of  unions,  Y  might  be  a 
measure  of  wages,  D  would  indicate  union  status,  and  Z  would  be  an  instrument  for  union 
status,  say  a  dummy  indicating  individuals  who  work  in  firms  that  were  subject  to  union 
organizing  campaigns  (Lalonde,  Marschke  and  Troske  (1996)).  We  also  allow  for  an  r  x  1 
vector  of  covariates,  X. 

As  in  Rubin  (1974,  1977)  and  our  earlier  work  on  instrumental  variables  estimation 
of  causal  effects,  we  define  the  causal  effects  of  interest  using  potential  outcomes  and 


potential treatment status. In particular, we define potential outcomes indexed against
D, $Y_d$, and potential treatment status indexed against Z, $D_z$. Potential outcomes and
potential treatment status describe possibly counterfactual states of the world. Thus, $D_1$
tells us what value D would take if Z were equal to 1, while $D_0$ tells us what value D would
take if Z were equal to 0. Similarly, $Y_d$ tells us what someone's outcome would be if they
had D = d. The objects of causal inference are features of the distribution of potential
outcomes, possibly restricted to particular subpopulations.
The observed treatment status is:

$$D = D_0 + (D_1 - D_0) \cdot Z.$$

In other words, if Z = 1, then $D_1$ is observed, while if Z = 0, then $D_0$ is observed. Likewise,
the observed outcome variable is:

$$Y = Y_0 \cdot (1 - D) + Y_1 \cdot D. \qquad (1)$$

The reason why causal inference is difficult is that although we think of all possible
counterfactual outcomes as being defined for everyone, only one potential treatment status and
one  potential  outcome  are  ever  observed  for  any  one  person.3 
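The mapping from potential to observed variables can be illustrated with a small simulation. This is only a sketch: the type shares, the earnings distributions, the offer probability, and the helper name `draw_unit` are hypothetical choices made for illustration and are not taken from the JTPA data.

```python
import random

def draw_unit(rng, p_offer=2/3):
    """Draw one unit: potential treatments, potential outcomes, and the offer Z.

    All numbers here are hypothetical illustration values.
    """
    u = rng.random()
    if u < 0.60:            # takes training only if offered
        d0, d1 = 0, 1
    elif u < 0.65:          # takes training regardless of the offer
        d0, d1 = 1, 1
    else:                   # never takes training
        d0, d1 = 0, 0
    y0 = rng.gauss(10.0, 2.0)           # earnings without training
    y1 = y0 + rng.gauss(1.0, 0.5)       # earnings with training
    z = 1 if rng.random() < p_offer else 0
    d = d0 + (d1 - d0) * z              # observed treatment status
    y = y0 * (1 - d) + y1 * d           # observed outcome: Y1 if D = 1, Y0 otherwise
    return {"z": z, "d": d, "y": y, "d0": d0, "d1": d1, "y0": y0, "y1": y1}

rng = random.Random(0)
sample = [draw_unit(rng) for _ in range(1000)]
```

In the simulated data every unit carries both potential outcomes, but an analyst would see only `z`, `d`, and `y`; that is the sense in which only one potential treatment status and one potential outcome are ever observed.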

2.1.    Principal  Assumptions 

The  principal  assumptions  of  the  potential  outcomes  framework  for  IV  are  stated  below: 

Assumption 2.1: For almost all values of X,

(i) Independence: $(Y_1, Y_0, D_1, D_0)$ is jointly independent of Z given X.

(ii) Non-Trivial Assignment: $P(Z = 1 \mid X) \in (0, 1)$.

(iii) First-Stage: $E[D_1 \mid X] \neq E[D_0 \mid X]$.

(iv) Monotonicity: $P(D_1 \geq D_0 \mid X) = 1$.


3  The idea of potential outcomes appears in labor economics in discussions of the effects of union status.
See,  for  example,  Lewis'  (1986)  survey  of  research  on  union  relative  wage  effects. 


Assumption 2.1(i) subsumes two related requirements. First, comparisons by instrument
status identify the causal effect of the instrument. This is equivalent to instrument-error
independence in traditional simultaneous equations models. Second, potential outcomes
are not directly affected by the instrument. This is an exclusion restriction. See
Angrist,  Imbens  and  Rubin  (1996)  for  additional  discussion  of  these  two  requirements  and 
how  they  differ.  Assumption  2.1(i)  is  plausible  (though  not  guaranteed)  in  the  case  of  the 
JTPA  because  of  the  randomly  assigned  offer  of  treatment. 

Assumption 2.1(ii) requires that the conditional distribution of the instrument not be
degenerate. The relationship between instruments and treatment assignment is restricted
in two other ways as well. As in simultaneous equations models, we require that there be
some correlation between D and Z; this is stated in Assumption 2.1(iii). Also, Imbens
and  Angrist  (1994)  have  shown  that  Assumption  2.1(iv)  guarantees  identification  of  a 
meaningful  average  treatment  effect  in  any  model  with  heterogeneous  potential  outcomes 
that  satisfies  assumptions  2.1(i)-2.1(iii).  This  monotonicity  assumption  means  that  the 
instrument  can  only  affect  D  in  one  direction.  Monotonicity  is  plausible  in  most  applications 
and  it  is  automatically  satisfied  by  latent-index  models  for  treatment  assignment.4  It  is 
also  a  reasonable  assumption  for  the  JTPA,  where  D0  =  0  for  (almost)  everyone. 

The inference problem in evaluation research involves comparisons of observed and counterfactual
outcomes, possibly after conditioning on observed covariates, X. For example,
many evaluation studies focus on estimating the difference between the average outcome
for the treated (which is observed) and what this average would have been in the absence
of treatment (which is counterfactual). Outside of a randomized trial, the difference in


4  A latent-index model for participation is

$$D = 1\{\lambda_0 + Z \cdot \lambda_1 - \eta > 0\},$$

where $\lambda_0$ and $\lambda_1$ are parameters and $\eta$ is an error term that is independent of Z. Then $D_0 = 1\{\lambda_0 > \eta\}$,
$D_1 = 1\{\lambda_0 + \lambda_1 > \eta\}$, and either $D_1 \geq D_0$ or $D_0 \geq D_1$ for everyone. If $\lambda_1 < 0$ so that $D_0 \geq D_1$ for
everyone, then monotonicity holds for $Z' = 1 - Z$.


average outcomes by observed treatment status is typically a biased estimate of this effect:

$$E[Y_1 \mid X, D = 1] - E[Y_0 \mid X, D = 0] = \{E[Y_1 \mid X, D = 1] - E[Y_0 \mid X, D = 1]\} + \{E[Y_0 \mid X, D = 1] - E[Y_0 \mid X, D = 0]\}.$$

The first term in brackets is the average effect of the treatment on the treated, which can
also be written as $E[Y_1 - Y_0 \mid X, D = 1]$ since expectation is a linear operator; the second
is  the  bias  term.  For  example,  comparisons  of  earnings  by  training  status  are  biased  if 
trainees  are  selected  for  training  on  the  basis  of  low  earnings  potential.  This  bias  extends 
to  comparisons  other  than  the  mean.  For  example,  the  relationship  above  holds  if  we 
replace  conditional  expectations  with  conditional  quantiles. 
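The decomposition can be checked by hand on a tiny population. The sketch below uses four equally likely individuals with hypothetical earnings in which people with low untreated earnings select into training; exact rational arithmetic avoids any sampling noise. The population values and the helper `cond_mean` are illustrative assumptions only.

```python
from fractions import Fraction as F

# Hypothetical population: (probability, D, Y0, Y1).
# Individuals with low Y0 are the ones who select into training (D = 1).
population = [
    (F(1, 4), 1, 8, 11),
    (F(1, 4), 1, 10, 12),
    (F(1, 4), 0, 12, 13),
    (F(1, 4), 0, 14, 15),
]

def cond_mean(value, d):
    """Exact E[value(Y0, Y1) | D = d] in the population above."""
    mass = sum(p for p, di, _, _ in population if di == d)
    total = sum(p * value(y0, y1) for p, di, y0, y1 in population if di == d)
    return total / mass

# Naive comparison of observed means by treatment status.
naive = cond_mean(lambda y0, y1: y1, 1) - cond_mean(lambda y0, y1: y0, 0)
# Average effect of treatment on the treated.
att = cond_mean(lambda y0, y1: y1 - y0, 1)
# Selection bias: difference in untreated outcomes between the two groups.
bias = cond_mean(lambda y0, y1: y0, 1) - cond_mean(lambda y0, y1: y0, 0)
```

Here the effect on the treated is 5/2, the bias term is -4, and the naive comparison is -3/2, so the naive contrast has the wrong sign even though training raises everyone's earnings.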

2.2.    Identification  Using  Instrumental  Variables 

An  instrumental  variable  solves  the  problem  of  identifying  causal  effects  for  a  group  of 
individuals  whose  treatment  status  is  affected  by  the  instrument.  The  following  result 
(Imbens  and  Angrist  (1994))  captures  this  idea  formally: 

Lemma 2.1: Under Assumption 2.1 (and assuming that the relevant expectations are finite)

$$\frac{E[Y \mid X, Z = 1] - E[Y \mid X, Z = 0]}{E[D \mid X, Z = 1] - E[D \mid X, Z = 0]} = E[Y_1 - Y_0 \mid X, D_1 > D_0].$$

$E[Y_1 - Y_0 \mid X, D_1 > D_0]$ is called a Local Average Treatment Effect (LATE). We refer
to individuals for whom $D_1 > D_0$ as compliers because in a randomized trial with partial
compliance, this group would consist of individuals who comply with the treatment protocol
whatever their assignment. In other words, the set of compliers is the set of individuals
whose treatment status was changed in the experiment induced by Z. Note that individuals
in this set cannot be identified (i.e., we cannot name the people who are compliers) because
we never observe both $D_1$ and $D_0$ for any one person. Also note that in the special case
where $D_0 = 0$ for everyone,

$$E[Y_1 - Y_0 \mid X, D_1 > D_0] = E[Y_1 - Y_0 \mid X, D_1 = 1] = E[Y_1 - Y_0 \mid X, D_1 = 1, Z = 1] = E[Y_1 - Y_0 \mid X, D = 1],$$

so LATE is the effect of treatment on the treated. The equivalence between effects for
compliers and effects on the treated in cases where $D_0$ is identically zero holds for any
distributional  characteristic  and  not  just  means. 
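Lemma 2.1 can be verified exactly on a stylized population with three types and no covariates. Everything in this sketch is a hypothetical illustration: the type shares, the potential earnings, the offer probability `PI`, and the helper names are assumptions, not quantities from the paper.

```python
from fractions import Fraction as F

PI = F(2, 3)  # P(Z = 1), assumed constant because there are no covariates here

# Hypothetical types: (name, probability, D0, D1, Y0, Y1)
TYPES = [
    ("complier",     F(6, 10), 0, 1, 10, 13),
    ("always-taker", F(1, 10), 1, 1, 14, 20),
    ("never-taker",  F(3, 10), 0, 0,  8,  8),
]

def cells():
    """Yield (probability, Z, D, Y) for every type-by-instrument cell."""
    for _, p, d0, d1, y0, y1 in TYPES:
        for z, pz in ((0, 1 - PI), (1, PI)):   # Z is independent of type
            d = d0 + (d1 - d0) * z
            y = y0 * (1 - d) + y1 * d
            yield p * pz, z, d, y

def mean_given_z(pick, z):
    """Exact E[pick(D, Y) | Z = z]."""
    mass = sum(p for p, zi, _, _ in cells() if zi == z)
    return sum(p * pick(d, y) for p, zi, d, y in cells() if zi == z) / mass

reduced_form = mean_given_z(lambda d, y: y, 1) - mean_given_z(lambda d, y: y, 0)
first_stage = mean_given_z(lambda d, y: d, 1) - mean_given_z(lambda d, y: d, 0)
wald = reduced_form / first_stage
```

The ratio equals 3, the effect for compliers, even though the always-takers in this example have a different effect (6) and the population average effect is 12/5.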

The  compliers  concept  is  at  the  heart  of  the  LATE  framework  and  provides  a  simple 
explanation  for  how  IV  methods  work.  Suppose  initially  that  we  could  know  who  the 
compliers are. For these people, Z = D, since it is always true that $D_1 > D_0$. This
observation plus Assumption 2.1 leads to the following lemma:

Lemma 2.2: Given Assumption 2.1 and conditional on X, treatment status, D, is ignorable
(independent of the potential outcomes) for compliers: $(Y_1, Y_0) \perp D \mid X, D_1 > D_0$.

Proof: Assumption 2.1(i) says that $(Y_1, Y_0, D_1, D_0) \perp Z \mid X$, so $(Y_1, Y_0) \perp Z \mid X, D_1 = 1, D_0 = 0$.
When $D_1 = 1$ and $D_0 = 0$, D can be substituted for Z. $\square$

A  consequence  of  Lemma  2.2  is  that  in  the  subpopulation  of  compliers,  comparisons 
of  means  by  treatment  status  estimate  an  average  treatment  effect  even  though  treatment 
assignment  is  not  ignorable  in  the  population: 

$$E[Y \mid X, D = 1, D_1 > D_0] - E[Y \mid X, D = 0, D_1 > D_0] = E[Y_1 - Y_0 \mid X, D_1 > D_0]. \qquad (2)$$

Of course, as it stands, Lemma 2.2 is of no practical use because the subpopulation of
compliers is not identified (i.e., we do not observe $D_1$ and $D_0$ for the same individual). To
make Lemma 2.2 operational, we define the following function of D, Z and X:

$$\kappa = 1 - \frac{D \cdot (1 - Z)}{1 - \pi_0(X)} - \frac{(1 - D) \cdot Z}{\pi_0(X)}, \qquad (3)$$

where $\pi_0(X) = P(Z = 1 \mid X)$. Note that $\kappa$ equals one when D = Z, otherwise $\kappa$ is negative.
This  function  is  useful  because  it  "identifies  compliers"  in  the  following  average  sense: 

Lemma 2.3: (Abadie, 1999b) Let h(Y,D,X) be any integrable real function of (Y,D,X).
Then, given Assumption 2.1,

$$E[h(Y, D, X) \mid D_1 > D_0] = \frac{1}{P(D_1 > D_0)} \cdot E[\kappa \cdot h(Y, D, X)].$$

To see why this is true, note that, by monotonicity, the population can be partitioned
into three groups: compliers who have $D_1 > D_0$, always-takers who have $D_1 = D_0 = 1$,
and never-takers who have $D_1 = D_0 = 0$. Thus,

$$E[h(Y, D, X) \mid X, D_1 > D_0] = \frac{1}{P(D_1 > D_0 \mid X)} \Big\{ E[h(Y, D, X) \mid X] - E[h(Y, D, X) \mid X, D_1 = D_0 = 1] \cdot P(D_1 = D_0 = 1 \mid X) - E[h(Y, D, X) \mid X, D_1 = D_0 = 0] \cdot P(D_1 = D_0 = 0 \mid X) \Big\}.$$


Monotonicity  means  that  all  individuals  with  Z  =  1  and  D  =  0  must  be  never-takers. 
Likewise,  those  with  Z  =  0  and  D  =  1  must  be  always-takers.  Since  Z  is  ignorable  given 
X,  we  have  the  following  expressions  for  always-takers  and  never-takers  as  a  function  of 
observed  moments: 

$$E[h(Y, D, X) \mid X, D_1 = D_0 = 1] = E[h(Y, D, X) \mid X, D = 1, Z = 0] = \frac{1}{P(D = 1 \mid X, Z = 0)} \cdot E\left[ \frac{D \cdot (1 - Z)}{1 - \pi_0(X)} \cdot h(Y, D, X) \,\Big|\, X \right],$$

$$E[h(Y, D, X) \mid X, D_1 = D_0 = 0] = E[h(Y, D, X) \mid X, D = 0, Z = 1] = \frac{1}{P(D = 0 \mid X, Z = 1)} \cdot E\left[ \frac{(1 - D) \cdot Z}{\pi_0(X)} \cdot h(Y, D, X) \,\Big|\, X \right].$$

Monotonicity and ignorability of Z given X can similarly be used to identify the proportions
of always-takers and never-takers using $P(D_1 = D_0 = 1 \mid X) = P(D = 1 \mid X, Z = 0)$ and
$P(D_1 = D_0 = 0 \mid X) = P(D = 0 \mid X, Z = 1)$. Integrating over X completes the argument.
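The weighting result in Lemma 2.3 can be confirmed exactly on a stylized population. In the sketch below the type shares, potential earnings, the constant offer probability `PI`, and the test function `h` are all hypothetical; the point is only that the contributions of always-takers and never-takers to the weighted mean cancel, leaving the complier mean.

```python
from fractions import Fraction as F

PI = F(2, 3)  # pi_0 = P(Z = 1), assumed constant (no covariates)

# Hypothetical types: (name, probability, D0, D1, Y0, Y1)
TYPES = [
    ("complier",     F(6, 10), 0, 1, 10, 13),
    ("always-taker", F(1, 10), 1, 1, 14, 20),
    ("never-taker",  F(3, 10), 0, 0,  8,  8),
]

def kappa(d, z):
    """Weight kappa: equals 1 when D = Z and is negative otherwise."""
    return 1 - F(d * (1 - z)) / (1 - PI) - F((1 - d) * z) / PI

def h(y, d):
    """An arbitrary (hypothetical) function of the observables."""
    return y * y + d

# E[kappa * h(Y, D)] over the whole population.
weighted = F(0)
for _, p, d0, d1, y0, y1 in TYPES:
    for z, pz in ((0, 1 - PI), (1, PI)):
        d = d0 + (d1 - d0) * z
        y = y0 * (1 - d) + y1 * d
        weighted += p * pz * kappa(d, z) * h(y, d)

p_complier = F(6, 10)
lhs = weighted / p_complier                      # kappa-weighted mean, rescaled
rhs = (1 - PI) * h(10, 0) + PI * h(13, 1)        # direct complier mean of h
```

Both sides equal 440/3. The weights for the two cells with D not equal to Z are -2 and -1/2 here, which is what makes the unweighted sample analog awkward to minimize later on.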

An  implication  of  Lemma  2.3  is  that  any  parameter  defined  as  the  solution  to  a  moment 
condition  involving  (Y,D,X)  is  identified  for  compliers.  This  point  is  explored  in  detail  in 
Abadie  (1999b).5  In  the  next  section,  we  show  how  Lemma  2.3  can  be  used  to  develop  an 


5  For example, if we define $\mu$ and $\alpha$ as

$$(\mu, \alpha) = \arg\min_{(m, a)} E[(Y - m - aD)^2 \mid D_1 > D_0],$$

then $\mu = E[Y_0 \mid D_1 > D_0]$ and $\alpha = E[Y_1 - Y_0 \mid D_1 > D_0]$, so that $\alpha$ is LATE (although $\mu$ is not the
same intercept that is identified by conventional IV methods). By Lemma 2.3, $(\mu, \alpha)$ also minimizes
$E[\kappa \cdot (Y - m - aD)^2]$.
estimator for the causal effect of treatment on the quantiles of an outcome variable.

3.    Quantile  Treatment  Effects 
3.1.    The  QTE  Model 

The  QTE  estimator  is  based  on  a  model  where  the  effect  of  treatment  and  covariates  is 
linear  and  additive  at  each  quantile,  so  that  a  single  treatment  effect  is  estimated.  The 
analysis  is  straightforward  when  the  treatment  effect  varies  with  X,  but  we  use  an  additive 
model  because  the  resulting  estimator  simplifies  to  Koenker  and  Bassett  (1978)  quantile 
regression  when  there  is  no  instrumenting.  The  relationship  between  QTE  and  quantile 
regression  is  therefore  analogous  to  the  relationship  between  conventional  IV  and  ordinary 
least  squares  (OLS). 

The  parameters  of  interest  are  defined  as  follows: 

Assumption 3.1: For $\theta \in (0, 1)$, there exist unique $\alpha_\theta \in \mathbb{R}$ and $\beta_\theta \in \mathbb{R}^r$ such that

$$Q_\theta(Y \mid X, D, D_1 > D_0) = \alpha_\theta D + X'\beta_\theta, \qquad (4)$$

where $Q_\theta(Y \mid X, D, D_1 > D_0)$ denotes the $\theta$-quantile of Y given X and D for compliers.

As a consequence of Lemma 2.2, the parameter of primary interest in this model, $\alpha_\theta$,
gives the difference in the $\theta$-quantiles of $Y_1$ and $Y_0$ for compliers. This tells us, for example,
whether JTPA training changed the median earnings of participants. Note, however, that
in contrast with average treatment effects, where average differences equal differences in
averages, $\alpha_\theta$ is not the quantile of the difference $(Y_1 - Y_0)$. Although the latter may also be of
interest, we focus on the marginal distributions of potential outcomes because identification
of the distribution of $Y_1 - Y_0$ requires much stronger assumptions and because economists
making social welfare comparisons typically use differences in distributions and not the
distribution of differences for this purpose (see, e.g., Atkinson (1970)).6
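A two-person illustration, with hypothetical numbers, may help fix the distinction between the difference in quantiles and the quantile of the difference. Suppose compliers consist of two equally likely individuals, A and B:

```latex
\begin{align*}
\text{A: } (Y_0, Y_1) &= (1, 3), & Y_1 - Y_0 &= +2, \\
\text{B: } (Y_0, Y_1) &= (3, 1), & Y_1 - Y_0 &= -2.
\end{align*}
% The marginal distributions coincide: Y_0 and Y_1 each take the values
% 1 and 3 with probability 1/2, so Q_\theta(Y_1) - Q_\theta(Y_0) = 0 for
% every \theta, even though no individual has a zero treatment effect.
```

In this example the quantile treatment effects are zero at every quantile, while the distribution of individual effects puts equal mass on +2 and -2. The two objects coincide only under additional restrictions, such as preservation of ranks under treatment.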


6  Heckman, Smith and Clements (1997) discuss models where features of the distribution of the difference
$(Y_1 - Y_0)$ are identified. They note that this may be of interest for questions regarding the political
economy  of  social  programs.  If  the  ranking  of  individuals  in  the  distribution  of  the  outcome  is  preserved 


The  model  above  differs  in  a  number  of  ways  from  the  model  in  the  seminal  papers 
by  Amemiya  (1982)  and  Powell  (1983),  who  used  least  absolute  deviations  to  estimate 
a  simultaneous  equations  system.  Their  approach  begins  with  a  traditional  simultaneous 
equations  model,  and  is  not  motivated  by  an  attempt  to  characterize  effects  on  distributions. 
Rather,  the  idea  is  to  improve  on  2SLS  when  the  distributions  of  the  error  terms  are  long- 
tailed.  Most  importantly,  in  contrast  with  the  parameters  in  equation  (4),  the  parameters 
of  interest  in  the  Amemiya/Powell  setup  do  not,  in  general,  define  a  conditional  quantile 
function.7 

The parameters of the conditional quantile function in equation (4) can be expressed
as (see Bassett and Koenker (1982)):

$$(\alpha_\theta, \beta_\theta) = \arg\min_{(\alpha, \beta) \in \mathbb{R}^{r+1}} E[\rho_\theta(Y - \alpha D - X'\beta) \mid D_1 > D_0],$$

where $\rho_\theta(\lambda)$ is the check function, defined as $\rho_\theta(\lambda) = (\theta - 1\{\lambda < 0\}) \cdot \lambda$ for any real $\lambda$.
Therefore, using Lemma 2.3, $\alpha_\theta$ and $\beta_\theta$ are identified as

$$(\alpha_\theta, \beta_\theta) = \arg\min_{(\alpha, \beta) \in \mathbb{R}^{r+1}} E[\kappa \cdot \rho_\theta(Y - \alpha D - X'\beta)]. \qquad (5)$$

This population objective function is globally convex in $(\alpha, \beta)$ since it is equal to the
check-function minimand for compliers times some positive constant $(P(D_1 > D_0))$. Following
the analogy principle (Manski (1988)) a natural estimator of $(\alpha_\theta, \beta_\theta)$ is the sample
counterpart of (5). However, since $\kappa$ is negative when D is not equal to Z, the sample
objective function turns out to be non-convex. A number of algorithms exist for minimization
problems of this type (piecewise linear and non-convex objective functions), but they
do not ensure a global optimum (see, e.g., Charnes and Cooper (1957) or Fitzenberger
(1997a,b), for a discussion of a related censored quantile regression problem). Unlike the


under  the  treatment,  then  the  estimator  in  this  paper  is  informative  about  the  distribution  of  treatment 
impacts.  King  (1983)  discusses  horizontal  equity  concerns  that  require  welfare  analyses  involving  the  joint 
distribution  of  outcomes. 

7  Identification  in  the  Amemiya  and  Powell  papers  comes  from  conditional  median  restrictions  on  the 
reduced  form.  However,  a  conditional  median  restriction  on  the  reduced  form  does  not  imply  that  the 
structural  equation  is  a  conditional  median.  In  fact,  for  a  binary  endogenous  regressor,  conditional  median 
restrictions  on  the  reduced  form  and  structural  equation  are  typically  incompatible. 
conventional quantile regression minimand, the sample analog of equation (5) does not have
a  linear  programming  representation. 
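The loss of convexity is easy to see in a three-observation example. The sketch below implements the check function and evaluates a weighted sample objective for a scalar location parameter; the data points and both weight vectors are hypothetical, chosen so that the arithmetic is exact.

```python
def check(theta, lam):
    """Check function: rho_theta(lam) = (theta - 1{lam < 0}) * lam."""
    return (theta - (1.0 if lam < 0 else 0.0)) * lam

def objective(a, ys, weights, theta=0.5):
    """Weighted sum of check-function losses for a scalar location a."""
    return sum(w * check(theta, y - a) for y, w in zip(ys, weights))

ys = [0.0, 1.0, 2.0]
kappa_weights = [1.0, -2.0, 1.0]     # hypothetical: middle observation has D != Z
positive_weights = [1.0, 0.5, 1.0]   # hypothetical non-negative weights

def midpoint_gap(weights, a=0.0, b=2.0):
    """Objective at the midpoint minus the average of the endpoint values.

    A strictly positive gap violates convexity on [a, b].
    """
    mid = 0.5 * (a + b)
    ends = 0.5 * (objective(a, ys, weights) + objective(b, ys, weights))
    return objective(mid, ys, weights) - ends
```

With the negative weight the gap is +1, so the objective lies above its chord and cannot be convex; with non-negative weights the gap is -0.25, as convexity requires.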

Now, let U = (Y,D,X); applying the Law of Iterated Expectations to equation (5), we
obtain

$$(\alpha_\theta, \beta_\theta) = \arg\min_{(\alpha, \beta) \in \mathbb{R}^{r+1}} E[\kappa_\nu \cdot \rho_\theta(Y - \alpha D - X'\beta)], \qquad (6)$$

where

$$\kappa_\nu = E[\kappa \mid U] = 1 - \frac{D \cdot (1 - \nu_0(U))}{1 - \pi_0(X)} - \frac{(1 - D) \cdot \nu_0(U)}{\pi_0(X)}$$

for $\nu_0(U) = E[Z \mid U] = P(Z = 1 \mid Y, D, X)$. Although simple to derive, this second representation
is of signal importance because, as we show below, $\kappa_\nu$ is a conditional probability
and is therefore non-negative.

Lemma 3.1: Under Assumption 2.1, $\kappa_\nu(U) = P(D_1 > D_0 \mid U)$.

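Proof: First consider the product $D \cdot (1 - Z)$. This differs from zero only if Z = 0 and
$D_0 = 1$. By monotonicity, $D_0 = 1$ implies $D_1 = 1$. Hence:

$$\begin{aligned}
E[D \cdot (1 - Z) \mid U] &= P(D(1 - Z) = 1 \mid U) \\
&= P(D_1 = D_0 = 1 \mid U) \cdot P(Z = 0 \mid D_1 = D_0 = 1, U) \\
&= P(D_1 = D_0 = 1 \mid U) \cdot P(Z = 0 \mid D_1 = D_0 = 1, Y_1, X) \\
&= P(D_1 = D_0 = 1 \mid U) \cdot P(Z = 0 \mid X).
\end{aligned}$$

Similarly, $E[(1 - D) \cdot Z \mid U] = P(D_1 = D_0 = 0 \mid U) \cdot P(Z = 1 \mid X)$. Therefore,

$$\begin{aligned}
\kappa_\nu(U) &= E\left[ 1 - \frac{D(1 - Z)}{P(Z = 0 \mid X)} - \frac{(1 - D)Z}{P(Z = 1 \mid X)} \,\Big|\, U \right] \\
&= 1 - P(D_1 = D_0 = 1 \mid U) - P(D_1 = D_0 = 0 \mid U) = P(D_1 > D_0 \mid U).
\end{aligned}$$

$\square$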
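Lemma 3.1 can be checked exactly when Y is discrete. The sketch below uses a hypothetical population in which different types share some observed (Y, D) values, computes $\nu_0(U)$ and $\kappa_\nu(U)$ from the joint distribution of observables, and compares the result with the true share of compliers in each cell. The type shares, earnings values, and the constant `PI` are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction as F

PI = F(2, 3)  # pi_0 = P(Z = 1), assumed constant (no covariates)

# Hypothetical types: (name, probability, D0, D1, Y0, Y1)
TYPES = [
    ("complier",     F(3, 10), 0, 1, 10, 12),
    ("complier",     F(3, 10), 0, 1,  8, 12),
    ("always-taker", F(1, 10), 1, 1,  9, 12),
    ("never-taker",  F(3, 10), 0, 0, 10, 11),
]

mass = defaultdict(F)        # P(U = u), with u = (Y, D)
mass_z1 = defaultdict(F)     # P(U = u, Z = 1)
mass_comp = defaultdict(F)   # P(U = u, complier)
for name, p, d0, d1, y0, y1 in TYPES:
    for z, pz in ((0, 1 - PI), (1, PI)):
        d = d0 + (d1 - d0) * z
        u = (y0 * (1 - d) + y1 * d, d)
        mass[u] += p * pz
        mass_z1[u] += p * pz * z
        if name == "complier":
            mass_comp[u] += p * pz

kappa_nu = {}
p_complier = {}
for (y, d), m in mass.items():
    nu0 = mass_z1[(y, d)] / m                     # nu_0(U) = P(Z = 1 | U)
    kappa_nu[(y, d)] = 1 - d * (1 - nu0) / (1 - PI) - (1 - d) * nu0 / PI
    p_complier[(y, d)] = mass_comp[(y, d)] / m    # true P(D1 > D0 | U)
```

In each of the three observed cells the two dictionaries agree (4/5, 1/4, and 1), so the projected weight is a probability and never negative, unlike $\kappa$ itself.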

A  consequence  of  this  lemma  is  that  it  is  possible  to  develop  a  QTE  estimator  with  an 
LP representation based on a sample analog of equation (6). This can be thought of as
a scheme to "convexify" the sample analog of (5). The resulting convex QTE estimator
minimizes a positively-weighted check-function minimand, with a global minimum that can
be obtained as the solution to a linear programming problem in a finite number of simplex
iterations. This is similar in spirit to Buchinsky and Hahn's (1998) LP-type estimator for
censored  quantile  regression. 

An interesting question is whether there is any efficiency cost to using an estimator
based on $\kappa_\nu$ instead of the sample analog of (5). In fact, it can be shown that both strategies
produce estimators that are asymptotically equivalent.8 In light of this result, the
rest of the paper focuses on estimation by minimizing the sample analog of (6). This requires
first-step estimation of $\pi_0(X)$ and $\nu_0(U)$ to construct an estimate of $\kappa_\nu(U)$, denoted
$\hat{\kappa}_\nu$. The distribution theory is developed assuming that X is discrete, so a saturated linear
model consistently estimates $\pi_0(X)$. We use series approximation to estimate $\nu_0(U)$
non-parametrically in (X, D) cells.9 If the number of terms in the series approximation
increases at an appropriate rate with the sample size, this procedure ensures that the estimated
conditional expectations converge to the true conditional expectations. In practice,
however,  it  may  make  sense  to  use  something  less  than  a  saturated  model  for  X  when  the 
dimensionality  of  X  is  high.  We  discuss  practical  aspects  of  the  estimation  strategy  further 
when  the  results  are  presented  in  Section  4. 

3.2.    Estimation 

Assume that we have a random sample $\{Y_i, D_i, X_i, Z_i\}_{i=1}^n$. Let $W = (D, X')'$ and
$\delta_\theta = (\alpha_\theta, \beta_\theta')'$ for $\alpha_\theta$ and $\beta_\theta$ in equation (4). If $\kappa_\nu$ were known, the estimation problem would
reduce to a weighted quantile regression problem of the type discussed by Newey and Powell
(1990). Since $\kappa_\nu$ is unknown, we estimate this function nonparametrically in a first step


8  See Newey (1994); the estimators are asymptotically equivalent since they nonparametrically estimate
the same functional.

9  Buchinsky and Hahn (1998) similarly decompose the covariates in a censored quantile regression problem
into a set of discrete variables and a set of continuous variables. They use non-parametric methods to
estimate  the  conditional  probability  of  censoring  in  cells  defined  by  the  discrete  variables. 
and use the fitted values $\hat{\kappa}_\nu(U_i)$ in a second step to construct the estimator:

$$\hat{\delta}_\theta = \arg\min_{\delta \in \mathbb{R}^{r+1}} \frac{1}{n} \sum_{i=1}^n 1\{\hat{\kappa}_\nu(U_i) \geq 0\} \cdot \hat{\kappa}_\nu(U_i) \cdot \rho_\theta(Y_i - W_i'\delta). \qquad (7)$$
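In the special case with no covariates other than a constant, so that the regressors are an intercept and D, the second-step problem separates into two weighted quantile computations, one per treatment arm, and no LP solver is needed. The sketch below covers only that special case; with additional covariates the minimization would instead be passed to a linear programming or weighted quantile regression routine. The function names, the data, and the first-step weights are hypothetical.

```python
def weighted_quantile(ys, weights, theta):
    """Smallest y whose cumulative weight reaches theta times the total weight.

    This value minimizes the weighted check-function loss over a scalar.
    Assumes at least one strictly positive weight.
    """
    pairs = sorted((y, w) for y, w in zip(ys, weights) if w > 0)
    target = theta * sum(w for _, w in pairs)
    running = 0.0
    for y, w in pairs:
        running += w
        if running >= target:
            return y
    return pairs[-1][0]

def qte_no_covariates(y, d, kappa_hat, theta):
    """Second step for regressors (1, D): trim negative fitted weights,
    then difference the weighted theta-quantiles of the two arms."""
    w = [max(k, 0.0) for k in kappa_hat]      # 1{kappa_hat >= 0} * kappa_hat
    q1 = weighted_quantile([yi for yi, di in zip(y, d) if di == 1],
                           [wi for wi, di in zip(w, d) if di == 1], theta)
    q0 = weighted_quantile([yi for yi, di in zip(y, d) if di == 0],
                           [wi for wi, di in zip(w, d) if di == 0], theta)
    return q1 - q0, q0                        # (treatment effect, intercept)

# Hypothetical data and first-step fitted weights.
y = [8.0, 10.0, 10.0, 12.0, 12.0, 13.0, 15.0]
d = [0, 0, 0, 1, 1, 1, 1]
kappa_hat = [1.0, 0.25, 0.25, 0.8, 0.8, 1.0, -0.1]
alpha_med, intercept_med = qte_no_covariates(y, d, kappa_hat, theta=0.5)
```

In this toy sample the weighted median is 8 among the untreated and 12 among the treated, giving a median effect of 4; the observation with a negative fitted weight is dropped by the indicator.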


First step estimation of $\kappa_\nu$ is carried out using non-parametric series regression. For an
increasing sequence of positive integers $\{\lambda(k)\}_{k=1}^{\infty}$ and a positive integer K, let
$p^K(Y) = (Y^{\lambda(1)}, ..., Y^{\lambda(K)})$. Assume that X only takes on a finite number of values (so that
$W \in \{w_1, ..., w_J\}$). Then, any random sample $\{V_i\}_{i=1}^n = \{(Z_i, U_i)\}_{i=1}^n$ from V = (Z,U) can
be indexed as $\{\{V_{i_j}\}_{i_j=1}^{n_j}\}_{j=1}^{J}$, where $\{V_{i_j}\}_{i_j=1}^{n_j}$ are subsequences for distinct fixed values of
(X, D). In the same fashion, the sample can be indexed as $\{\{V_{i_l}\}_{i_l=1}^{n_l}\}_{l=1}^{L}$, where $\{V_{i_l}\}_{i_l=1}^{n_l}$ are
subsequences for distinct fixed values of X. Now, a nonparametric power series estimator
$\hat{\nu}(U)$ of $\nu_0(U)$ is given by the Least Squares projection of $\{Z_{i_j}\}_{i_j=1}^{n_j}$ on $\{p^K(Y_{i_j})\}_{i_j=1}^{n_j}$ (this
amounts to non-parametric series regression of Z on Y in each W-cell). Let $\hat{\nu}_i$ be the fitted
values of such estimator for the observations in our sample. Consider the simple estimator
$\hat{\pi}(X)$ of $\pi_0(X)$ obtained by averaging Z within cells of X. Our first step estimator of $\kappa_\nu$ is
given by:

$$\hat{\kappa}_\nu(U_i) = 1 - \frac{D_i \cdot (1 - \hat{\nu}_i)}{1 - \hat{\pi}(X_i)} - \frac{(1 - D_i) \cdot \hat{\nu}_i}{\hat{\pi}(X_i)}.$$
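The first step can be sketched in a few lines. To keep the example dependency-free, the series basis is restricted to a constant and a linear term in Y within each (X, D) cell, which is an assumption of this sketch rather than the specification used in the paper; a higher-order polynomial would be handled the same way with a general least-squares routine. The function names and the eight-observation data set are hypothetical, and the sketch assumes the cell means of Z lie strictly between zero and one.

```python
from collections import defaultdict

def fit_line(ys, zs):
    """Least-squares fit of z on (1, y); returns (intercept, slope)."""
    n = len(ys)
    ybar, zbar = sum(ys) / n, sum(zs) / n
    syy = sum((yi - ybar) ** 2 for yi in ys)
    if syy == 0:
        return zbar, 0.0
    slope = sum((yi - ybar) * (zi - zbar) for yi, zi in zip(ys, zs)) / syy
    return zbar - slope * ybar, slope

def first_step_kappa(y, d, x, z):
    """Fitted weights: linear-in-Y fit of Z within (X, D) cells for nu,
    cell means of Z within X cells for pi."""
    by_w, by_x = defaultdict(list), defaultdict(list)
    for i, (di, xi) in enumerate(zip(d, x)):
        by_w[(xi, di)].append(i)     # (X, D) cells
        by_x[xi].append(i)           # X cells
    pi_hat = {xi: sum(z[i] for i in idx) / len(idx) for xi, idx in by_x.items()}
    nu_hat = [0.0] * len(y)
    for idx in by_w.values():
        a, b = fit_line([y[i] for i in idx], [z[i] for i in idx])
        for i in idx:
            nu_hat[i] = a + b * y[i]
    return [1 - d[i] * (1 - nu_hat[i]) / (1 - pi_hat[x[i]])
              - (1 - d[i]) * nu_hat[i] / pi_hat[x[i]] for i in range(len(y))]

# Hypothetical sample with a single X cell.
y = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
z = [0, 1, 0, 1, 1, 1, 0, 1]
d = [0, 0, 0, 0, 1, 1, 1, 1]
x = [0] * 8
kappa_hat = first_step_kappa(y, d, x, z)
```

Because each within-cell regression includes a constant, the fitted weights average to the first-stage difference in treatment rates between Z = 1 and Z = 0 (4/15 in this toy sample), which is the sample share of compliers.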

3.3.    Distribution  Theory 

This  subsection  summarizes  asymptotic  results  for  the  QTE  estimator.  Proofs  are  given  in 
the  appendix. 

Theorem 3.1: Under assumptions 2.1 and 3.1 and if (i) the data are i.i.d.; (ii) conditional
on W, Y is continuously distributed with support equal to a compact interval and density
bounded away from zero; (iii) $\pi_0(X)$ is bounded away from zero and one, and X takes on a
finite number of values; (iv) conditional on W, $\varepsilon_\theta$ is continuously distributed with bounded
density; the distribution function of $\varepsilon_\theta$ conditional on W and $D_1 > D_0$ is continuously
differentiable at zero with density $f_{\varepsilon_\theta \mid W, D_1 > D_0}(0)$ that is bounded and bounded away from zero
uniformly in W; (v) $\kappa_\nu$ is bounded away from zero uniformly in U; (vi) for s equal to the
number of continuous derivatives in Y of $\nu_0$, $n \cdot K^{-2s} \to 0$ and $K^5/n \to 0$. Then,
$n^{1/2}(\hat{\delta}_\theta - \delta_\theta) \xrightarrow{d} N(0, \Omega)$, where $\Omega = J^{-1} \Sigma J^{-1}$,
$J = E[f_{\varepsilon_\theta \mid W, D_1 > D_0}(0) \cdot WW' \mid D_1 > D_0] \cdot P(D_1 > D_0)$
and $\Sigma = E[\psi\psi']$ with $\psi = \kappa \cdot m(U) + H(X) \cdot \{Z - \pi_0(X)\}$, $m(U) = (\theta - 1\{Y - W'\delta_\theta < 0\}) \cdot W$, and

$$H(X) = E\left[ m(U) \cdot \left( -\frac{D \cdot (1 - Z)}{(1 - \pi_0(X))^2} + \frac{(1 - D) \cdot Z}{(\pi_0(X))^2} \right) \,\Big|\, X \right].$$

The  asymptotic  variance  formula  provided  by  this  theorem  is  robust  to  mis-specification 
of  the  functional  form  (in  Assumption  3.1).  In  such  a  case,  quantile  regression  estimates 
the  best  linear  predictor  under  asymmetric  loss.10 

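To produce an estimator of the asymptotic variance matrix, let

\[
\varphi_h(\delta) = \frac{1}{h} \varphi\left( \frac{Y - W'\delta}{h} \right)
\quad \text{and} \quad
\varphi_{h,i}(\delta) = \frac{1}{h} \varphi\left( \frac{Y_i - W_i'\delta}{h} \right),
\]

where $\varphi(\cdot)$ is a kernel function. Consider the following estimator of $J$:

\[
\hat J = \frac{1}{n} \sum_{i=1}^{n} \hat\kappa_\nu(U_i) \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot W_i W_i'.
\]

For $i$ in the $l$-cell of $X$, let

\[
\hat H_i = \frac{1}{n_l} \sum_{k_l=1}^{n_l} (\theta - 1\{Y_{k_l} - W_{k_l}'\hat\delta_\theta < 0\}) \cdot W_{k_l} \cdot
\left( \frac{(1 - D_{k_l}) \cdot Z_{k_l}}{(\hat\pi(X_i))^2} - \frac{D_{k_l} \cdot (1 - Z_{k_l})}{(1 - \hat\pi(X_i))^2} \right),
\]

\[
\hat\kappa(V_i) = 1 - \frac{D_i \cdot (1 - Z_i)}{1 - \hat\pi(X_i)} - \frac{(1 - D_i) \cdot Z_i}{\hat\pi(X_i)},
\]

\[
\hat\psi_i = \hat\kappa(V_i) \cdot (\theta - 1\{Y_i - W_i'\hat\delta_\theta < 0\}) \cdot W_i + \hat H_i \cdot \{Z_i - \hat\pi(X_i)\}.
\]

An estimator of $\Sigma$ can then be constructed as

\[
\hat\Sigma = \frac{1}{n} \sum_{i=1}^{n} \hat\psi_i \hat\psi_i'.
\]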

The following theorem establishes the consistency of an asymptotic covariance matrix
estimator.


10 Most of the literature on quantile regression treats the linear model as a literal specification for
conditional quantiles. Alternatively, the linear model can be viewed as an approximation. This interpretation is
discussed by Buchinsky (1991), Chamberlain (1991), Fitzenberger (1997), and Portnoy (1991).
Theorem 3.2: Under the assumptions of Theorem 3.1 and (i) $h \to 0$, $n h^4 \to \infty$; (ii)
for some neighborhood of 0, $f_{\varepsilon_\theta|W,D_1>D_0}(\cdot)$ has bounded and continuous first derivative;
(iii) $\varphi(z) \ge 0$, $\int \varphi(z)\,dz = 1$, $\int |z \cdot \varphi(z)|\,dz < \infty$; (iv) there exists $C > 0$ such that
$|\varphi(z) - \varphi(z_0)| \le C \cdot |z - z_0|$. Then $\hat\Omega = \hat J^{-1} \hat\Sigma \hat J^{-1} \xrightarrow{p} \Omega$.
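The variance estimator of this subsection can be illustrated with a short Python sketch. It assumes that the first-step weights and the vectors $\hat\psi_i$ have already been computed and stacked row-wise in `psi_hat`; the function names `biweight` and `sandwich_variance` and the bandwidth argument `h` are hypothetical. The biweight kernel is used here as one kernel that satisfies conditions (iii) and (iv) of Theorem 3.2.

```python
import numpy as np

def biweight(u):
    """Biweight kernel: (15/16) * (1 - u^2)^2 on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def sandwich_variance(y, W, delta_hat, kappa_nu_hat, psi_hat, h):
    """Sketch of Omega_hat = J_hat^{-1} Sigma_hat J_hat^{-1} (names are hypothetical).

    y: (n,) outcomes; W: (n, k) regressors; delta_hat: (k,) coefficient estimates;
    kappa_nu_hat: (n,) first-step weights; psi_hat: (n, k) stacked psi_hat_i; h: bandwidth.
    """
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    psi_hat = np.asarray(psi_hat, dtype=float)
    n = len(y)
    resid = y - W @ np.asarray(delta_hat, dtype=float)
    phi = biweight(resid / h) / h                       # phi_{h,i}(delta_hat)
    weights = np.asarray(kappa_nu_hat, dtype=float) * phi
    J_hat = (W * weights[:, None]).T @ W / n            # (1/n) sum kappa * phi * W W'
    Sigma_hat = psi_hat.T @ psi_hat / n                 # (1/n) sum psi psi'
    J_inv = np.linalg.inv(J_hat)
    omega_hat = J_inv @ Sigma_hat @ J_inv
    std_err = np.sqrt(np.diag(omega_hat) / n)           # standard errors of delta_hat
    return omega_hat, std_err
```

In the exogenous case with weights equal to one and an intercept only, this reduces to the usual kernel-based variance estimate for a sample quantile, which gives a simple way to check an implementation.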

4.    Effects  of  Subsidized  Training 
4.1.    Background 

The JTPA began funding training in October 1983, and continued to fund federally-sponsored
training programs into the late 1990s. The program included a number of parts
or  "titles",  the  largest  of  which  is  Title  II,  which  supports  training  for  those  judged  to  be 
economically  disadvantaged.  At  the  time  of  the  National  JTPA  Study  in  the  early  1990s, 
JTPA  Title  II  programs  were  serving  about  1  million  participants  a  year,  at  an  annual  cost 
of  roughly  1.6  billion  dollars.  JTPA  services  were  delivered  at  649  sites,  also  called  Service 
Delivery  Areas  (SDAs),  located  throughout  the  country. 

Title II of the JTPA is unusual in that it explicitly incorporated a mandate for randomized
evaluation.11 The National JTPA study is the largest randomized training evaluation
ever  undertaken  in  the  US.  The  JTPA  evaluation  study  collected  data  on  about  20,000 
participants  at  16  SDAs.  These  sites  were  not  a  random  sample  of  all  SDAs;  rather,  they 
were  chosen  for  diversity,  willingness  and  ability  to  implement  the  experimental  design, 
and  the  size  and  composition  of  the  experimental  sample  they  could  provide.  Although 
the non-random selection of sites raises issues of external validity (as in many clinical trials),
within sites, applicants were randomly selected for JTPA treatment. The evaluation
sample  includes  applicants  who  applied  between  November  1987  and  September  1989. 

The  original  study  of  the  labor-market  impact  of  Title  II  services  was  based  on  15,981 
persons  for  whom  continuous  data  on  earnings  (from  either  State  unemployment  insurance 
(UI)  records  or  two  follow-up  surveys)  were  available  for  at  least  30  months  after  random 


11  Other  parts  of  the  JTPA,  such  as  Title  III  programs  for  workers  who  lost  their  jobs  as  a  consequence  of 
international  competition,  did  not  include  a  randomized  evaluation.  Background  for  this  section  is  drawn 
from  Orr,  et  al.  (1996),  Bloom,  et  al.  (1997),  and  the  US  Department  of  Labor  (1999)  website. 
assignment. Although data are available on a range of labor market outcomes for this
sample, we focus on the sum of earnings in this 30-month period since this is probably the
best measure of the program's lasting economic impact on participants. Individuals who
were not offered treatment were generally excluded from receiving JTPA services for a period
of 18 months following their application (though they could participate in other programs
at  any  time). 

The  JTPA  was  a  complicated  program  that  offered  a  wide  range  of  services.  JTPA 
service providers included community colleges, State employment services, community
organizations, and private-sector training agencies. The types of services offered can be
grouped into three general service strategies. These strategies are (i) classroom training in
occupational skills, basic education, or both; (ii) on-the-job training and/or job search
assistance (OJT/JSA); (iii) other services that may have included probationary employment
and/or a combination of the first two. For the National JTPA Study, service strategies
were recommended as part of the JTPA intake process, before random assignment.
Although individuals were assigned to treatment with different probabilities depending on
their  SDA,  the  data  in  the  analysis  sample  were  artificially  balanced  to  maintain  a  2/1 
treatment-control  ratio  at  each  location. 

The  JTPA  offered  services  to  a  number  of  different  groups.  Title  II  applicants  were 
generally deemed eligible for training if they faced one of a number of "barriers to
employment". These included long-term use of welfare, being a high school dropout, 15 or
more recent weeks of unemployment, limited English proficiency, physical or mental
disability, reading proficiency below 7th grade level, or an arrest record. The most common
barriers were unemployment spells and high-school dropout status. Applicants were
categorized as being in one of five groups: adult men, adult women, female youth, male youth
non-arrestees, and male youth arrestees. In this study we focus on adult men and women
because the samples are largest for these two groups. There are 6,102 adult women with
30-month earnings data and 5,102 adult men with 30-month earnings data.
4.2. Average Effects

Using our earlier notation, Y is 30-month earnings, D indicates those who were recorded as
having  been  enrolled  for  JTPA  services,  and  Z  indicates  the  offer  of  services.  Although  the 
offer  of  treatment  was  randomly  assigned,  only  about  60  percent  of  those  offered  training 
actually  received  JTPA  services.  This  is  a  consequence  of  the  JTPA  evaluation  design, 
which  randomized  the  offer  of  services  early  in  the  application  process,  but  did  not  compel 
those  offered  services  to  participate  in  training. 

While  all  applicants  indicated  at  least  some  interest  in  receiving  JTPA  services,  those 
offered  treatment  were  not  necessarily  notified  immediately,  and  some  time  may  have  passed 
before  training  could  begin.  In  the  meantime,  applicants  selected  for  treatment  may  have 
found  jobs,  received  services  somewhere  else,  or  simply  lost  interest.  Providers  may  also 
have had an incentive to delay the enrollment of applicants that they thought were
unlikely to benefit from treatment, while some SDAs were unable to find service providers for
some applicants. Also, on the other side of the randomization offer, a small proportion of
those selected for the control group (1.6 percent) received JTPA services despite the
experimenters' attempt to prevent this.12 Treatment status is therefore likely to be correlated
with  potential  outcomes  and  cannot  be  treated  as  exogenous. 

Although  treatment  status  itself  was  not  randomly  assigned,  the  assumptions  of  our 
framework  appear  to  apply  in  this  case:  the  randomized  offer  of  treatment  is  likely  to  have 
been  independent  of  potential  outcomes,  the  offer  of  treatment  is  unlikely  to  have  affected 
outcomes  through  any  mechanism  other  than  treatment  itself,  and  denial  of  services  by 
randomization  is  not  likely  to  have  made  treatment  more  likely.  Moreover,  because  of  the 
very low probability of receiving JTPA services in the control (Z = 0) group, effects for
compliers in this case can also be interpreted as effects on those who were treated.

Since  training  offers  were  randomized  in  the  National  JTPA  Study,  covariates  (X)  are 
not  required  to  identify  training  effects.     Even  in  experiments  like  this,  however,  it  is 


12  Adult  men  and  women  who  were  offered  treatment  ultimately  received  about  150  more  hours  of  training 
services  than  were  received  by  the  members  of  the  control  group. 




customary  to  control  for  covariates  to  correct  for  chance  associations  between  D  and  X  (as 
in Orr, et al., 1996). Moreover, in our setup, covariates can be used to describe earnings
quantiles for compliers in population subgroups, since we estimate $Q_\theta(Y|X, D, D_1 > D_0)$.
We therefore include as covariates dummies for black and Hispanic applicants, a dummy
for high-school graduates (including GED holders), dummies for married applicants, 5
age-group dummies, and dummies for AFDC receipt (for women) and whether the applicant
worked at least 12 weeks in the 12 months preceding random assignment. Also included are
dummies for the original recommended service strategy (classroom, OJT/JSA, other) and
a dummy for whether earnings data are from the second follow-up survey.13 In addition,
the  analysis  is  carried  out  separately  for  men  and  women  since  previously  reported  results 
differed  by  sex. 

Descriptive  statistics  are  reported  in  Table  I.  There  are  more  minority  applicants  than 
in  the  general  population  and,  not  surprisingly  given  the  program  rules,  a  relatively  low 
proportion  of  high  school  graduates.  The  applicants  also  have  low  previous  employment 
rates.  Most  of  the  men  were  recommended  for  OJT/JSA  services,  while  the  women  were 
slightly  more  likely  to  be  recommended  for  classroom  training  than  OJT/JSA  or  other 
services.  Average  30-month  earnings  in  the  sample  are  about  $19,000  for  men  and  $13,000 
for  women. 

As  a  benchmark  for  the  purposes  of  comparison  with  earlier  analyses  of  the  JTPA, 
Table  II  reports  OLS  and  conventional  instrumental  variables  (2SLS)  estimates  of  the 
impact  of  training.14  The  first  column  reports  unadjusted  trainee/non-trainee  differences, 
while  the  OLS  estimates  in  column  (2)  are  from  a  regression  of  the  dependent  variable  on 
the covariates and a training dummy (D). Without the use of covariates, the training/non-training
difference is $3,970 for men and $2,133 for women. Trainee/nontrainee differences
are  precisely  measured  for  both  men  and  women.    OLS  estimates  of  training  effects  in 


13  The  covariate  information  comes  from  a  background  survey  conducted  as  part  of  the  JTPA  intake 
process.  The  covariate  list  used  here  is  similar  to  that  described  in  Appendix  B  of  Orr  et  al.  (1996),  except 
that  we  collapsed  some  categories  and  omitted  SDA  dummies  because  they  had  low  explanatory  power. 

14 As with the quantile regression and QTE estimates discussed later, the standard errors in Table II
are  robust  in  the  sense  that  they  provide  consistent  estimates  of  the  asymptotic  variance  of  the  estimators 
under  general  misspecification. 
models with covariates are similar to the differences without controls.

The  reduced-form  effects  of  the  offer  of  treatment  are  reported  in  columns  (3)  and  (4). 
Not  surprisingly,  since  Z  was  randomly  assigned  without  conditioning  on  covariates,  the 
estimates  with  and  without  covariates  differ  little.  Note  that  the  reduced  form  estimates 
are  not  directly  comparable  with  the  OLS  estimates  since  many  of  those  offered  training 
did  not  actually  receive  training. 

The instrumental variable estimates in columns (5) and (6) of Table II use the randomized
offer of treatment (Z) as an instrument for D in the same regression as was used to
construct the estimates in columns (1) and (2). This corrects for non-participation among
those offered training. When covariates are included, the 2SLS estimate for men is $1,593
with a standard error of $895, less than half the size of the corresponding OLS estimate.
For women, however, the 2SLS estimate is $1,780 with a standard error of $532, not
dramatically different from the corresponding OLS estimate. This amounts to a 9 percent
earnings  increase  for  men  and  a  15  percent  earnings  increase  for  women.  These  results  are 
similar  to  those  reported  in  previous  studies.15 
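The link between the reduced-form estimates and the instrumental variable estimates can be illustrated with a minimal Python sketch. In the simplest case, with a binary instrument and no covariates, the IV estimate is the reduced-form difference in mean earnings by offer status divided by the difference in enrollment rates by offer status. The function name `wald_iv` is hypothetical, and the sketch omits the covariates used in Table II.

```python
import numpy as np

def wald_iv(y, d, z):
    """IV estimate with a binary instrument and no covariates (illustrative sketch).

    y: outcomes; d: 0/1 treatment received; z: 0/1 randomized offer.
    Returns (iv_estimate, reduced_form, first_stage).
    """
    y, d, z = (np.asarray(a, dtype=float) for a in (y, d, z))
    reduced_form = y[z == 1].mean() - y[z == 0].mean()   # effect of the offer on Y
    first_stage = d[z == 1].mean() - d[z == 0].mean()    # effect of the offer on D
    return reduced_form / first_stage, reduced_form, first_stage
```

With an enrollment rate of roughly 60 percent among those offered services and close to zero in the control group, this rescaling makes the IV estimate considerably larger than the reduced-form effect of the offer.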

4.3.    Estimates  of  Quantile  Treatment  Effects 

Table III reports OLS and conventional quantile regression estimates of the effect of training.
The covariates are the same as those used to construct the estimates in Table II. The
OLS estimate of the training coefficient is $3,754 for men and $2,215 for women. The quantile
regression estimates show that the gap in quantiles by trainee status is much larger (in
proportionate  terms)  below  the  median  than  above  it.  For  men,  the  .85  quantile  of  trainee 
earnings  is  about  13  percent  higher  than  the  corresponding  quantile  for  non-trainees,  while 
the  .15  quantile  is  136  percent  higher.  For  women  the  difference  in  impact  across  quantiles 
is  less  dramatic,  but  still  marked.  Like  the  OLS  estimates  shown  in  the  table,  the  quantile 
regression  coefficients  do  not  necessarily  have  a  causal  interpretation.  Rather  they  provide 


15 The 2SLS estimates in Table II are very close to those in Table 4.6 of the National JTPA study by Orr
et  al.  (1996).  The  estimates  are  not  identical  because  the  covariates  are  not  identical.  Percentage  effects 
were  computed  as  the  coefficient  on  training,  divided  by  fitted  values  with  the  training  dummy  set  to  zero 
and  other  covariates  set  to  means  for  the  treated. 
a descriptive comparison of earnings distributions for trainees and non-trainees.

Implementation of the QTE estimator requires first step estimation of $\kappa_\nu$. The
theoretical results in the previous section are based on non-parametric series estimation of
the conditional expectations in $\kappa_\nu$, but this leaves open a range of possibilities. Since the
elements of X are discrete, non-parametric estimation of E[Z|X] is in principle
straightforward. In practice, however, a fully saturated model leads to problems with small or missing
covariate cells. We therefore estimated E[Z|X] using the restriction that because of
random assignment, Z and X should be uncorrelated in large samples. The resulting estimate
is simply the empirical E[Z]. The expectation $\nu_0(U) = E[Z|Y, D, X]$ was estimated using
separate models for D = 0, 1. Most X's were dropped because they had little explanatory
value. A series approximation was used to estimate terms in Y. Selection of the order for
the series approximation was guided by cross-validation. The order is the same for both
values of D.16
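Given first-step weights, the second step is a weighted quantile regression that can be written as a linear program. The following Python sketch is one way to set it up, assuming SciPy's `linprog` with the HiGHS solver rather than the Barrodale-Roberts algorithm used for the estimates in this paper. The function name `weighted_quantile_regression` is hypothetical, negative weights are set to zero to mimic the trimming in equation (7), and the dense constraint matrix makes the sketch suitable only for small samples.

```python
import numpy as np
from scipy.optimize import linprog

def weighted_quantile_regression(y, W, weights, theta):
    """Minimize sum_i weights_i * rho_theta(y_i - W_i' delta) as a linear program."""
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    # Negative weights are set to zero, as in the trimmed objective.
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)
    n, k = W.shape
    # Decision variables: delta (free), then positive and negative residual parts.
    cost = np.concatenate([np.zeros(k), theta * w, (1.0 - theta) * w])
    A_eq = np.hstack([W, np.eye(n), -np.eye(n)])   # W delta + u_plus - u_minus = y
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:k]
```

With unit weights and an intercept only, the solution is a sample quantile, which provides a quick check of the setup before the first-step weights are plugged in.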

QTE  estimates  of  the  effect  of  training  on  median  earnings,  reported  in  Table  IV,  are 
similar  in  magnitude  though  less  precisely  estimated  than  the  2SLS  estimates  in  Table  II. 
As  noted  earlier,  for  women  the  2SLS  estimates  are  not  much  smaller  than  OLS  estimates, 
but  for  men  the  2SLS  estimates  are  considerably  smaller  than  OLS. 

A particularly interesting finding for men is that the QTE estimates of effects on quantiles
exhibit a pattern very different from the quantile regression estimates. In particular,
the QTE estimates show no evidence of a change in the .15 or .25 quantile. The estimates
at low quantiles are substantially smaller than the corresponding quantile regression
estimates, and they are small in absolute terms. For example, the QTE estimate (standard
error) of the effect on the .15 quantile for men is $121 (475), while the corresponding
quantile regression estimate is $1,187 (205). Similarly, the QTE estimate (standard error) of the
effect on the .25 quantile for men is $702 (670), while the corresponding quantile regression


16 Hausman and Newey (1995) use a similar approach to dimension-reduction for non-parametric
estimation of consumer demand equations. Given estimates of $\kappa_\nu$, we computed QTE coefficient estimates by
weighted quantile regression using the Barrodale-Roberts (1973) linear programming algorithm for
quantile regression (see, e.g., Koenker and D'Orey (1987)). A biweight kernel was used for the estimation of
standard errors.
estimate is $2,510 (356). This suggests that quantile regression estimates of training effects
at  low  quantiles  are  especially  distorted  by  positive  selection  on  earnings  potential.  It  seems 
that  training  did  not  really  change  the  lower  deciles  of  the  trainee  earnings  distribution  for 
men.  In  contrast  with  the  results  at  low  quantiles,  however,  the  QTE  estimates  of  effects  on 
male  earnings  above  the  median  are  large  and  statistically  significant  (though  still  smaller 
than  the  corresponding  quantile  regression  estimates). 

The  QTE  estimates  for  women  show  significant  effects  of  training  at  every  quantile, 
with  the  largest  proportional  effects  at  low  quantiles.  For  example,  training  is  estimated 
to  raise  the  .15  quantile  of  earnings  for  women  by  $324  (175),  an  increase  of  35  percent. 
The  estimates  also  suggest  training  raises  the  .85  quantile  by  $1,900  (997),  but  this  is  an 
increase  of  only  7  percent.  Most  of  the  QTE  estimates  for  women  are  reasonably  close  to 
the  corresponding  quantile  regression  estimates.  Thus,  whether  or  not  training  is  treated 
as  endogenous,  the  estimates  support  the  notion  that  for  women  training  had  a  bigger 
proportional  impact  on  the  lower  tail  of  the  earnings  distribution  than  the  upper  tail.  Of 
course,  women's  earnings  are  especially  low  in  this  sample,  so  large  proportional  effects  do 
not  translate  into  large  dollar  amounts.17 

Orr,  et  al.  (1996)  reported  effects  by  subgroups  but  found  no  clear  patterns.  They 
concluded  that  (p.  160)  "the  benefits  of  JTPA  are  broadly  distributed  across  a  wide  variety 
of  different  types  of  men  and  women."  Heckman,  Smith,  and  Clements  (1997)  similarly 
concluded  that  heterogeneity  of  impacts  is  important  but  that  most  women  benefited  from 
the  JTPA.  Our  results  do  not  contradict  these  general  conclusions,  but  they  nevertheless 
show  more  heterogeneity  in  program  effects  than  is  revealed  by  a  simple  analysis  within 
subgroups.  In  particular,  our  results  strongly  suggest  that  training  for  adult  women  had  a 
much  larger  proportional  effect  on  the  lower  tail  of  the  earnings  distribution  than  on  the 
upper  tail  (though  the  absolute  effect  on  the  lower  tail  is  small). 


17 Interestingly, QTE estimates of the proportional effects of training on men are smaller than
conventional quantile regression estimates not only because the training impact is lower, but also because the
constant is bigger. This reflects the fact that the constant and covariate-effects estimated by QTE are for
$Q_\theta(Y_0|X, D_1 > D_0)$. This is bigger than the quantile regression intercept because of positive selection for
male compliers.
Perhaps most striking among our findings is the result that training for adult men does
not seem to have raised the lower quantiles of their earnings. This may be because of
an effort by program operators to target services at relatively easy-to-employ men with
higher earnings potential. This results in distributional changes that would be undesirable
in  any  assessment  using  a  social  welfare  function  that  weights  the  lower  tail  of  the  earnings 
distribution  more  heavily.  Since  the  ostensible  purpose  of  the  JTPA  was  to  aid  economically 
disadvantaged  workers,  it  seems  likely  that  the  lower  quantiles  are  of  particular  concern  to 
policy  makers.  One  response  to  this  finding  might  be  that  few  JTPA  applicants  were  very 
well  off,  so  that  distributional  effects  within  applicants  are  of  less  concern  than  the  fact  that 
the  program  helped  many  applicants  overall.  However,  the  upper  quantiles  of  earnings  were 
reasonably  high  for  adult  males  who  participated  in  the  National  JTPA  Study.  Increasing 
this  upper  tail  is  therefore  unlikely  to  have  been  a  high  priority. 

5.    Summary  and  Conclusions 

This paper reports estimates of the effect of subsidized training on the quantiles of earnings
for participants. We use a new estimator for the effect of a non-ignorable treatment
on  quantiles.  The  QTE  estimator  can  be  used  to  determine  how  an  intervention  affects 
the  distribution  of  any  variable  for  individuals  whose  treatment  status  is  changed  by  a 
binary  instrument.  The  estimator  accommodates  exogenous  covariates  and  collapses  to 
conventional  quantile  regression  when  the  treatment  is  exogenous.  It  minimizes  a  convex 
piecewise-linear  objective  function  similar  to  that  for  conventional  quantile  regression,  and 
can be computed as the solution to a linear programming problem after first-step estimation
of a nuisance function. The paper develops distribution theory for the case where
this  first  step  is  estimated  nonparametrically.  QTE  estimates  of  the  effect  of  training  on 
the  quantiles  of  the  earnings  distribution  suggest  interesting  and  important  differences  in 
program  effects  at  different  quantiles,  and  differences  in  distributional  impact  for  men  and 
women.  These  differences  are  large  enough  to  potentially  change  the  welfare  analysis  of 
the  JTPA  program. 




Appendix 

Proof  of  Theorem  3.1: 

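This proof largely follows that of Theorem 1 in Buchinsky and Hahn (1998). Consider,

\[
G_n(\tau, \kappa) = \sum_{i=1}^{n} g_i(\tau, \kappa),
\]

where

\[
g_i(\tau, \kappa) = \kappa(U_i) \cdot \{\theta \cdot [(\varepsilon_{\theta i} - n^{-1/2} W_i'\tau)^+ - \varepsilon_{\theta i}^+] + (1 - \theta) \cdot [(\varepsilon_{\theta i} - n^{-1/2} W_i'\tau)^- - \varepsilon_{\theta i}^-]\},
\]

and $\varepsilon_{\theta i} = Y_i - W_i'\delta_\theta$. The function $G_n(\tau, 1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu)$ is convex in $\tau$
and it is minimized at $\hat\tau_n = \sqrt{n}(\hat\delta_\theta - \delta_\theta)$. Now, define $\Gamma_n(\tau, \kappa) = E[G_n(\tau, \kappa)]$. Note that,

\[
\frac{\partial g_i(\tau, \kappa_\nu)}{\partial \tau} = -n^{-1/2}\, W_i \cdot \kappa_\nu(U_i) \cdot (\theta - 1\{\varepsilon_{\theta i} - n^{-1/2} W_i'\tau < 0\})
\]

almost surely. By (iv) and Weierstrass domination,

\[
\frac{\partial E[g(\tau, \kappa_\nu)]}{\partial \tau}\Big|_{\tau=0} = -n^{-1/2}\, E[W \kappa_\nu(U) \cdot (\theta - 1\{\varepsilon_\theta < 0\})] = 0,
\]

\[
\frac{\partial^2 E[g(\tau, \kappa_\nu)]}{\partial \tau\, \partial \tau'}\Big|_{\tau=0} = n^{-1}\, E[f_{\varepsilon_\theta|W,D_1>D_0}(0) \cdot WW' \mid D_1 > D_0] \cdot P(D_1 > D_0).
\]

Then,

\[
\Gamma_n(\tau, \kappa_\nu) = \frac{1}{2} \tau' J \tau + o(1),
\]

where $J = E[f_{\varepsilon_\theta|W,D_1>D_0}(0) \cdot WW' \mid D_1 > D_0] \cdot P(D_1 > D_0)$. Note that by (ii) and since both $\kappa_\nu$ and
$f_{\varepsilon_\theta|W,D_1>D_0}(0)$ are bounded away from zero, $J$ is non-singular. Define,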


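\[
\zeta_n(U_i) = n^{-1/2} (\theta - 1\{\varepsilon_{\theta i} < 0\}) \cdot W_i,
\]

\[
\omega_n(\kappa) = \sum_{i=1}^{n} \kappa(U_i) \cdot \zeta_n(U_i),
\]

and

\[
\rho_n(U_i, \kappa, \tau) = \kappa(U_i) \cdot \{\theta \cdot [(\varepsilon_{\theta i} - n^{-1/2} W_i'\tau)^+ - \varepsilon_{\theta i}^+] + (1 - \theta) \cdot [(\varepsilon_{\theta i} - n^{-1/2} W_i'\tau)^- - \varepsilon_{\theta i}^-] + \tau' \zeta_n(U_i)\}.
\]

Note that $E[\omega_n(\kappa_\nu)] = 0$, then

\[
\begin{aligned}
G_n(\tau, \kappa) &= \Gamma_n(\tau, \kappa_\nu) + (G_n(\tau, \kappa) - \Gamma_n(\tau, \kappa_\nu)) \\
&= \Gamma_n(\tau, \kappa_\nu) - \tau' \omega_n(\kappa) + (G_n(\tau, \kappa) + \tau' \omega_n(\kappa) - \{\Gamma_n(\tau, \kappa_\nu) + \tau' E[\omega_n(\kappa_\nu)]\}) \\
&= \Gamma_n(\tau, \kappa_\nu) - \tau' \omega_n(\kappa) + \sum_{i=1}^{n} \{\rho_n(U_i, \kappa, \tau) - E[\rho_n(U, \kappa_\nu, \tau)]\}.
\end{aligned}
\]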

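Lemma A.1:

\[
\omega_n(1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu) = n^{-1/2} \sum_{i=1}^{n} \psi_i + o_p(1)
\quad \text{with} \quad E[\psi] = 0 \quad \text{and} \quad E\|\psi\|^2 < \infty.
\]

Lemma A.2:

\[
\sum_{i=1}^{n} \{\rho_n(U_i, 1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu, \tau) - E[\rho_n(U, \kappa_\nu, \tau)]\} = o_p(1).
\]

Applying Lemma A.2:

\[
G_n(\tau, 1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu) = \frac{1}{2} \tau' J \tau - \tau' \omega_n(1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu) + o_p(1)
\]

for a given $\tau$. Let $\eta_n = J^{-1} \omega_n(1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu)$. Note that:

\[
\frac{1}{2} (\tau - \eta_n)' J (\tau - \eta_n) = \frac{1}{2} \tau' J \tau - \tau' \omega_n(1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu) + \frac{1}{2} \eta_n' J \eta_n.
\]

Define $\lambda_n(\tau) = G_n(\tau, 1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu) + \tau' \omega_n(1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu)$, then
$\lambda_n(\tau) = \frac{1}{2} \tau' J \tau + o_p(1)$. Since $\lambda_n(\tau)$ is convex in $\tau$, applying Pollard's convexity lemma (Pollard (1991)):

\[
\sup_{\tau \in T} \left| \lambda_n(\tau) - \frac{1}{2} \tau' J \tau \right| \xrightarrow{p} 0,
\]

where $T$ is any compact subset of $\mathbb{R}^{r+1}$. Then,

\[
G_n(\tau, 1\{\hat\kappa_\nu > 0\} \cdot \hat\kappa_\nu) = \frac{1}{2} (\tau - \eta_n)' J (\tau - \eta_n) - \frac{1}{2} \eta_n' J \eta_n + r_n(\tau)
\]

with $\sup_{\tau \in T} |r_n(\tau)| = o_p(1)$. So, by Lemma 3 in Buchinsky and Hahn (1998), we have that $\hat\tau_n = \eta_n + o_p(1)$.
Therefore, by Lemma A.1

\[
n^{1/2} (\hat\delta_\theta - \delta_\theta) \xrightarrow{d} N(0, J^{-1} \Sigma J^{-1}),
\]

where $\Sigma = E[\psi\psi']$.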

Proof of Lemma A.1: To prove this lemma we use the assumption that $\kappa_\nu$ is bounded away from
zero. This assumption is probably stronger than necessary but it allows us to ignore the trimming using
$1\{\hat\kappa_\nu > 0\}$, making the asymptotics easier. Assumption (vi) implies that $K \cdot ((K/n_j)^{1/2} + K^{-s}) \to 0$
almost surely for all $j \in \{1, \ldots, J\}$. Therefore $\sup_{U \in \mathcal{U}} |\hat\nu - \nu_0| = o_p(1)$ (see, e.g., Newey (1997), Theorem
4). Since $\pi_0$ is bounded away from zero and one (by (iii)), then $\sup_{U \in \mathcal{U}} |\hat\kappa_\nu - \kappa_\nu| = o_p(1)$. Since $\kappa_\nu$ is
bounded away from zero, with probability approaching one the trimming is not binding and we can ignore
it for the asymptotics.

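\[
\omega_n(\hat\kappa_\nu) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left( 1 - \frac{D_i \cdot (1 - \hat\nu_i)}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot \hat\nu_i}{\pi_0(X_i)} \right) + R_n,
\]

where $R_n$ collects the terms that arise from replacing $\hat\pi(X_i)$ by $\pi_0(X_i)$.
Let $\pi_0^l$ be the population mean of $Z$ for the $l$-cell of $X$ and $\hat\pi^l$ its sample counterpart. Then

\[
R_n = \frac{1}{\sqrt{n}} \sum_{l=1}^{L} n_l (\hat\pi^l - \pi_0^l) \cdot \frac{1}{n_l} \sum_{i_l=1}^{n_l} m(U_{i_l}) \cdot
\left( \frac{(1 - D_{i_l}) \cdot \hat\nu_{i_l}}{\hat\pi^l \cdot \pi_0^l} - \frac{D_{i_l} \cdot (1 - \hat\nu_{i_l})}{(1 - \hat\pi^l) \cdot (1 - \pi_0^l)} \right).
\]

Also, note that

\[
\left\| \frac{1}{n_l} \sum_{i_l=1}^{n_l} m(U_{i_l}) \cdot
\left( \frac{(1 - D_{i_l}) \cdot (\hat\nu_{i_l} - \nu_{0 i_l})}{\hat\pi^l \cdot \pi_0^l} + \frac{D_{i_l} \cdot (\hat\nu_{i_l} - \nu_{0 i_l})}{(1 - \hat\pi^l) \cdot (1 - \pi_0^l)} \right) \right\|
\le \sup_{U \in \mathcal{U}} |\hat\nu - \nu_0| \cdot \frac{1}{n_l} \sum_{i_l=1}^{n_l} \|m(U_{i_l})\| \cdot
\left( \frac{1 - D_{i_l}}{\hat\pi^l \cdot \pi_0^l} + \frac{D_{i_l}}{(1 - \hat\pi^l) \cdot (1 - \pi_0^l)} \right) = o_p(1).
\]

Then, applying Lemma 4.3 in Newey and McFadden (1994),

\[
\frac{1}{n_l} \sum_{i_l=1}^{n_l} m(U_{i_l}) \cdot
\left( \frac{(1 - D_{i_l}) \cdot \hat\nu_{i_l}}{\hat\pi^l \cdot \pi_0^l} - \frac{D_{i_l} \cdot (1 - \hat\nu_{i_l})}{(1 - \hat\pi^l) \cdot (1 - \pi_0^l)} \right)
\xrightarrow{p}
E\left[ m(U) \cdot \left( \frac{(1 - D) \cdot \nu_0}{(\pi_0(X))^2} - \frac{D \cdot (1 - \nu_0)}{(1 - \pi_0(X))^2} \right) \Big|\, X \right]
= E\left[ m(U) \cdot \left( \frac{(1 - D) \cdot Z}{(\pi_0(X))^2} - \frac{D \cdot (1 - Z)}{(1 - \pi_0(X))^2} \right) \Big|\, X \right],
\]

where the conditional expectations are evaluated at the value of $X$ in the $l$-cell. Therefore,

\[
\omega_n(\hat\kappa_\nu) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left( 1 - \frac{D_i \cdot (1 - \hat\nu_i)}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot \hat\nu_i}{\pi_0(X_i)} \right)
+ \frac{1}{\sqrt{n}} \sum_{i=1}^{n} H(X_i) \cdot \{Z_i - \pi_0(X_i)\} + o_p(1).
\]

To prove,

\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left( 1 - \frac{D_i \cdot (1 - \hat\nu_i)}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot \hat\nu_i}{\pi_0(X_i)} \right)
= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left( 1 - \frac{D_i \cdot (1 - Z_i)}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot Z_i}{\pi_0(X_i)} \right) + o_p(1),
\]

notice that

\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left( 1 - \frac{D_i \cdot (1 - \hat\nu_i)}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot \hat\nu_i}{\pi_0(X_i)} \right)
= \sum_{j=1}^{J} \sqrt{\frac{n_j}{n}} \cdot \frac{1}{\sqrt{n_j}} \sum_{i_j=1}^{n_j} m(U_{i_j}) \cdot
\left( 1 - \frac{D_{i_j} \cdot (1 - \hat\nu_{i_j})}{1 - \pi_0(X_{i_j})} - \frac{(1 - D_{i_j}) \cdot \hat\nu_{i_j}}{\pi_0(X_{i_j})} \right).
\]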
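So, we just have to show that for each $j \in \{1, \ldots, J\}$

\[
\frac{1}{\sqrt{n_j}} \sum_{i_j=1}^{n_j} m(U_{i_j}) \cdot
\left( 1 - \frac{D_{i_j} \cdot (1 - \hat\nu_{i_j})}{1 - \pi_0(X_{i_j})} - \frac{(1 - D_{i_j}) \cdot \hat\nu_{i_j}}{\pi_0(X_{i_j})} \right)
= \frac{1}{\sqrt{n_j}} \sum_{i_j=1}^{n_j} m(U_{i_j}) \cdot
\left( 1 - \frac{D_{i_j} \cdot (1 - Z_{i_j})}{1 - \pi_0(X_{i_j})} - \frac{(1 - D_{i_j}) \cdot Z_{i_j}}{\pi_0(X_{i_j})} \right) + o_p(1).
\]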


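This will be done by checking assumptions 6.1 to 6.6 in Newey (1994). Assumptions 6.1 and 6.2 follow
directly from the conditions of the theorem (see Newey (1994), page 1373). Assumption 6.3 holds with
$d = 0$ and $\alpha_d = s$. Assumption 6.4 holds for $b(z) = 0$ and derivative equal to

\[
m(U) \cdot \left( \frac{D}{1 - \pi_0(X)} - \frac{1 - D}{\pi_0(X)} \right).
\]

Assumptions 6.5 and 6.6 follow from: (i) $n_j \cdot K^{-2s} \to 0$; (ii) $K^5/n_j \to 0$ (almost surely). In particular, to
check Assumption 6.5 note that (vi) implies that $s > 5/2$, therefore $K \cdot K^{-s} \to 0$ (note that Assumption
6.5 is also valid with $d = 0$). To check assumption 6.6 note that since

\[
E\left\| m(U) \cdot \left( \frac{D}{1 - \pi_0(X)} - \frac{1 - D}{\pi_0(X)} \right) \right\|^2 < \infty,
\]

then, there exists a sequence $\xi_K$ such that

\[
E\left\| m(U) \cdot \left( \frac{D}{1 - \pi_0(X)} - \frac{1 - D}{\pi_0(X)} \right) - \xi_K\, p^K(U) \right\|^2 \to 0
\]

as $K \to \infty$ (see Newey (1994), page 1380 last paragraph). Now, applying the results in Newey (1994),

\[
\begin{aligned}
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left( 1 - \frac{D_i \cdot (1 - \hat\nu_i)}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot \hat\nu_i}{\pi_0(X_i)} \right)
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left\{ \left( 1 - \frac{D_i \cdot (1 - \nu_{0i})}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot \nu_{0i}}{\pi_0(X_i)} \right)
+ \left( \frac{D_i}{1 - \pi_0(X_i)} - \frac{1 - D_i}{\pi_0(X_i)} \right) (Z_i - \nu_{0i}) \right\} + o_p(1) \\
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(U_i) \cdot \left( 1 - \frac{D_i \cdot (1 - Z_i)}{1 - \pi_0(X_i)} - \frac{(1 - D_i) \cdot Z_i}{\pi_0(X_i)} \right) + o_p(1),
\end{aligned}
\]

and the result of the lemma holds.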


Proof  of  Lemma  A. 2  :  Note  that  pn(Ui,l{Kv  >  0}-Kv,T)-pn(Ui,K1/,T)  =  (1{k„  >  0}-K„-nv)-Sn(Ui,T), 
where 

Sn(Ui,r)     =     9-[{eei-n-"2W'lT)+-el} 

+     (1-6).  [(ew  -  n"1/2  W!t)~  -  e,"]  +  r'UU), 

so  \Sn(Ui,r)\  <  n-xl2\{\e8i\  <  n-l'2\W[T\}  ■  \W[t\.  Also, 

E[n  ■  \Sn(Ui,r\}  <  n1'2  E[l{\ee\  <  jT^W'tW  ■  \W'r\] 

~Fu\w(n-W  \W'r\)  -  Feolw(-7i-V2  \W'r 


=  E 


-1/2 


\W't\ 


2-E[feolw(0)-\W'T\2}<™ 
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Then, 


^pn(Ui,l{Kv  >  0}  -KV,T)  -pn(Ui,Kv,T) 


2  =  1 


i=l 

1      - 

<     sup  \kv  -k„\  ■  -  ^n-  \Sn(Ui,T)\  =  op(l). 


ueu 


i=l 


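Also, by cancellation of cross-product terms,

\[
\begin{aligned}
E\left[ \left( \sum_{i=1}^{n} \left\{ \rho_n(U_i, \kappa_\nu, \tau) - E[\rho_n(U, \kappa_\nu, \tau)] \right\} \right)^2 \right]
&= \sum_{i=1}^{n} E\left[ \left( \rho_n(U_i, \kappa_\nu, \tau) - E[\rho_n(U, \kappa_\nu, \tau)] \right)^2 \right] \\
&\le E\left[ 1\{|\varepsilon_{\theta i}| \le n^{-1/2} |W_i'\tau|\} \cdot |W_i'\tau|^2 \right] \\
&\to 0,
\end{aligned}
\]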

and  the  result  of  the  lemma  holds. 
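
The bound on $|S_n(U_i, \tau)|$ used in this proof can be checked numerically. The sketch below is only an illustration: it assumes the check function $\rho_\theta(u) = \theta u^+ + (1-\theta) u^-$, writes `v` for the shift $n^{-1/2} W_i'\tau$, and uses simulated values for `eps` and `v`. The function names are hypothetical and do not come from the paper.

```python
import numpy as np

def check_loss(u, theta):
    """Check function: theta * u^+ + (1 - theta) * u^-, with u^- = max(-u, 0)."""
    return theta * np.maximum(u, 0.0) + (1.0 - theta) * np.maximum(-u, 0.0)

def remainder(eps, v, theta):
    """rho(eps - v) - rho(eps) plus the linear correction v * (theta - 1{eps < 0})."""
    indicator = (eps < 0).astype(float)
    return check_loss(eps - v, theta) - check_loss(eps, theta) + v * (theta - indicator)

# Simulated residuals and shifts (illustrative values only).
rng = np.random.default_rng(0)
eps = rng.normal(size=100_000)
v = rng.normal(scale=0.5, size=100_000)
theta = 0.25

s = remainder(eps, v, theta)
# Claimed bound: |S| <= |v| * 1{|eps| <= |v|}.
bound = np.abs(v) * (np.abs(eps) <= np.abs(v))
max_violation = float(np.max(np.abs(s) - bound))
```

Up to floating-point rounding, `max_violation` should not be positive: the remainder vanishes whenever the shift is too small to move the residual across zero.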

Proof  of  Theorem  3.2: 

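Consistency of $\hat\Sigma$ is easy to prove, so we focus on $\hat J$. By (ii) and (iii), for $0 < \varepsilon^* < \varepsilon_\theta$:

\[
\begin{aligned}
E[\varphi_h(\delta_\theta) \mid W, D_1 > D_0]
&= \int \frac{1}{h} \, \varphi\!\left( \frac{y - W'\delta_\theta}{h} \right) f_{Y|W,D_1>D_0}(y) \, dy \\
&= \int \varphi(z) \, f_{\varepsilon_\theta|W,D_1>D_0}(h \cdot z) \, dz \\
&= f_{\varepsilon_\theta|W,D_1>D_0}(0) + h \cdot \int z \cdot \varphi(z) \, f'_{\varepsilon_\theta|W,D_1>D_0}(\varepsilon^*) \, dz \\
&= f_{\varepsilon_\theta|W,D_1>D_0}(0) + O(h).
\end{aligned}
\tag{A.1}
\]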


Therefore,

\[
\lim_{h \to 0} E[\varphi_h(\delta_\theta) \mid W, D_1 > D_0] = f_{\varepsilon_\theta|W,D_1>D_0}(0).
\]


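By equation (A.1) and condition (iv) in Theorem 3.1, $E[\varphi_h(\delta_\theta) \mid W, D_1 > D_0]$ is eventually bounded (in
absolute value) by a constant. Since $W$ is also bounded, we have that

\[
\begin{aligned}
\lim_{h \to 0} E[\kappa_\nu \cdot \varphi_h(\delta_\theta) \cdot WW']
&= \lim_{h \to 0} E\left[ E[\varphi_h(\delta_\theta) \mid W, D_1 > D_0] \cdot WW' \mid D_1 > D_0 \right] \cdot P(D_1 > D_0) \\
&= E\left[ f_{\varepsilon_\theta|W,D_1>D_0}(0) \cdot WW' \mid D_1 > D_0 \right] \cdot P(D_1 > D_0).
\end{aligned}
\]

Also, since $\kappa_\nu$, $\varphi(\cdot)$ and $W$ are bounded,

\[
\mathrm{var}(\kappa_\nu \cdot \varphi_h(\delta_\theta) \cdot WW') = O(1/h^2).
\]

Since $n \cdot h^2 \to \infty$, then

\[
\frac{1}{n} \sum_{i=1}^{n} \kappa_\nu(U_i) \cdot \varphi_{h,i}(\delta_\theta) \cdot W_i W_i' \overset{p}{\to} E\left[ f_{\varepsilon_\theta|W,D_1>D_0}(0) \cdot WW' \mid D_1 > D_0 \right] \cdot P(D_1 > D_0).
\tag{A.2}
\]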

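Notice that, since $\kappa_\nu$ is bounded away from zero (uniformly in $U$),

\[
\begin{aligned}
& \left\| \frac{1}{n} \sum_{i=1}^{n} \hat\kappa_\nu(U_i) \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot W_i W_i' - \frac{1}{n} \sum_{i=1}^{n} \kappa_\nu(U_i) \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot W_i W_i' \right\| \\
&\quad \le C \cdot \sup_{u \in \mathcal{U}} |\hat\kappa_\nu - \kappa_\nu| \cdot \frac{1}{n} \sum_{i=1}^{n} \left\| \kappa_\nu \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot W_i W_i' \right\|
 = C \cdot \sup_{u \in \mathcal{U}} |\hat\kappa_\nu - \kappa_\nu| \cdot \frac{1}{n} \sum_{i=1}^{n} \kappa_\nu \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot \| W_i W_i' \|.
\end{aligned}
\]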
Under the assumptions of Theorem 3.1, $\sup_{u \in \mathcal{U}} |\hat\kappa_\nu - \kappa_\nu| \to 0$. In addition,

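\[
\begin{aligned}
& \left| \frac{1}{n} \sum_{i=1}^{n} \kappa_\nu \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot \| W_i W_i' \| - \frac{1}{n} \sum_{i=1}^{n} \kappa_\nu \cdot \varphi_{h,i}(\delta_\theta) \cdot \| W_i W_i' \| \right| \\
&\quad \le C \cdot \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h} \cdot \frac{1}{h} \, \| \hat\delta_\theta - \delta_\theta \| \cdot \| W_i \| \\
&\quad \le C \cdot n^{1/2} \| \hat\delta_\theta - \delta_\theta \| \cdot (n^{1/2} h^2)^{-1} = o_p(1).
\end{aligned}
\]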
As  shown  above, 

'  i=i 
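Therefore,

\[
\frac{1}{n} \sum_{i=1}^{n} \hat\kappa_\nu(U_i) \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot W_i W_i' = \frac{1}{n} \sum_{i=1}^{n} \kappa_\nu(U_i) \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot W_i W_i' + o_p(1).
\tag{A.3}
\]

By (i) and (iv), for some constant $C$,

\[
\left\| \frac{1}{n} \sum_{i=1}^{n} \kappa_\nu(U_i) \cdot \varphi_{h,i}(\hat\delta_\theta) \cdot W_i W_i' - \frac{1}{n} \sum_{i=1}^{n} \kappa_\nu(U_i) \cdot \varphi_{h,i}(\delta_\theta) \cdot W_i W_i' \right\|
\le C \cdot n^{1/2} \| \hat\delta_\theta - \delta_\theta \| \cdot (n^{1/2} h^2)^{-1} = o_p(1).
\tag{A.4}
\]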


Combining equations (A.2), (A.3) and (A.4), we get $\hat J \overset{p}{\to} J$.
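
The proof treats $\hat J$ as a kernel-weighted sample average of $W_i W_i'$, with weights $\hat\kappa_\nu(U_i) \cdot \varphi_{h,i}(\hat\delta_\theta)$ and $\varphi_{h,i}(\delta) = h^{-1} \varphi((Y_i - W_i'\delta)/h)$. A minimal sketch of that average follows. It rests on assumptions that are not taken from the paper: an Epanechnikov kernel stands in for $\varphi$, the function names `epanechnikov` and `estimate_J` are hypothetical, the data are simulated, and the weights are set to one (the case with no selection) so the result can be compared with $f_\varepsilon(0) \cdot E[WW']$.

```python
import numpy as np

def epanechnikov(z):
    """Bounded, compactly supported kernel (an illustrative choice for phi)."""
    return 0.75 * np.maximum(1.0 - z**2, 0.0)

def estimate_J(y, W, delta_hat, kappa_hat, h):
    """(1/n) * sum_i kappa_i * phi_h(y_i - W_i' delta_hat) * W_i W_i'."""
    resid = y - W @ delta_hat
    weights = kappa_hat * epanechnikov(resid / h) / h
    return (W * weights[:, None]).T @ W / len(y)

# Simulated example: W = (1, x), standard normal errors, weights equal to one.
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
W = np.column_stack([np.ones(n), x])
delta = np.array([1.0, 2.0])
y = W @ delta + rng.normal(size=n)

J_hat = estimate_J(y, W, delta, np.ones(n), h=0.3)
# Benchmark for this design: f(0) * E[WW'] with f(0) = 1/sqrt(2*pi), E[x^2] = 1/3.
J_target = (1.0 / np.sqrt(2.0 * np.pi)) * np.array([[1.0, 0.0], [0.0, 1.0 / 3.0]])
```

In this design `J_hat` should be close to `J_target`, with a small downward bias of order $h^2$ from kernel smoothing. In an application, `kappa_hat` would come from the first-step estimate and `delta_hat` from the QTE fit.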
Table I
Means and Standard Deviations

                                      Entire         Assignment        Difference
                                      Sample   Treatment     Control    (t-stat.)

A. Men

Number of observations                 5,102       3,399       1,703

Baseline Characteristics
  Age                                  32.91       32.85       33.04        -.19
                                      [9.46]      [9.46]      [9.45]      (-.67)
  High school or GED                     .69         .69         .69        -.00
                                       [.45]       [.45]       [.45]      (-.12)
  Married                                .35         .36         .34         .02
                                       [.47]       [.47]       [.46]      (1.64)
  Black                                  .25         .25         .25         .00
                                       [.44]       [.44]       [.44]       (.04)
  Hispanic                               .10         .10         .09         .01
                                       [.30]       [.30]       [.29]       (.70)
  Worked less than 13                    .40         .40         .40         .00
  weeks in past year                   [.47]       [.47]       [.47]       (.56)

Experimental Characteristics
  Second follow-up                       .29         .30         .28         .02
                                       [.46]       [.46]       [.45]      (1.14)
  Training                               .42         .62         .01         .61
                                       [.49]       [.48]       [.11]     (70.34)
  Service strategy:
    Classroom training                   .20         .21         .19         .02
                                       [.40]       [.41]       [.39]      (1.73)
    OJT/JSA                              .50         .50         .50         .00
                                       [.50]       [.50]       [.50]       (.07)
    Other                                .29         .29         .31        -.02
                                       [.46]       [.45]       [.46]     (-1.57)

Outcome variable:
  30 month earnings                   19,147      19,520      18,404       1,116
                                    [19,540]    [19,912]    [18,760]      (1.96)
Table I (continued)

                                      Entire         Assignment        Difference
                                      Sample   Treatment     Control    (t-stat.)

B. Women

Number of observations                 6,102       4,088       2,014

Baseline Characteristics
  Age                                  33.33       33.33       33.35        -.02
                                      [9.78]      [9.77]      [9.81]      (-.09)
  High school or GED                     .72         .73         .70         .03
                                       [.43]       [.43]       [.44]      (2.01)
  Married                                .22         .22         .21         .01
                                       [.40]       [.40]       [.39]      (1.55)
  Black                                  .26         .27         .26         .01
                                       [.44]       [.44]       [.44]       (.95)
  Hispanic                               .12         .12         .12        -.00
                                       [.32]       [.32]       [.33]      (-.89)
  Worked less than 13                    .52         .52         .52        -.00
  weeks in past year                   [.47]       [.47]       [.47]      (-.08)
  AFDC                                   .31         .30         .31        -.01
                                       [.46]       [.46]       [.46]     (-1.03)

Experimental Characteristics
  Second follow-up                       .26         .26         .25         .01
                                       [.44]       [.44]       [.43]       (.45)
  Training                               .45         .66         .02         .64
                                       [.50]       [.47]       [.13]     (80.24)
  Service strategy:
    Classroom training                   .38         .38         .39        -.01
                                       [.49]       [.49]       [.49]      (-.37)
    OJT/JSA                              .37         .37         .38        -.01
                                       [.48]       [.48]       [.49]      (-.85)
    Other                                .24         .25         .23         .02
                                       [.43]       [.43]       [.42]      (1.40)

Outcome variable:
  30 month earnings                   13,029      13,439      12,197       1,242
                                    [13,415]    [13,614]    [12,964]      (3.46)

Note:  The  first  three  columns  of  the  table  report  means  and  standard  deviations  (in  brackets) 
for  the  National  JTPA  Study  30-month  earnings  sample.  The  last  column  shows  the  difference  in 
means by assignment status and reports the t-statistic (in parentheses) for the null hypothesis of
equality  in  means. 
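
The t-statistics in the last column can be approximately recovered from the reported means, standard deviations and sample sizes. The sketch below assumes a two-sample statistic with unequal variances; the table note does not state which formula was used, so this is an illustration rather than the authors' procedure. The function name is hypothetical, and the inputs are the rounded 30-month earnings figures from Table I.

```python
import math

def diff_in_means_t(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """t-statistic for equality of two means, allowing unequal variances."""
    se = math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)
    return (mean_t - mean_c) / se

# 30-month earnings, treatment versus control assignment (rounded table values).
t_men = diff_in_means_t(19520, 19912, 3399, 18404, 18760, 1703)
t_women = diff_in_means_t(13439, 13614, 4088, 12197, 12964, 2014)
```

With these inputs the statistic comes out near 1.96 for men and near 3.46 for women, in line with the table. Rows reported to two decimals reproduce less precisely because of rounding.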




Table III
Quantile Regression and OLS Estimates

Dependent variable: 30-month earnings

                                                        Quantile
                                 OLS      0.15      0.25      0.50      0.75      0.85

A. Men

  Training                     3,754     1,187     2,510     4,420     4,678     4,806
                               (536)     (205)     (356)     (651)     (937)   (1,055)
  % Impact of Training         21.20    135.56     75.20     34.50     17.24     13.43

  High school or GED           4,015       339     1,280     3,665     6,045     6,224
                               (571)     (186)     (305)     (618)   (1,029)   (1,170)
  Black                       -2,354      -134      -500    -2,084    -3,576    -3,609
                               (626)     (194)     (324)     (684)   (1,087)   (1,331)
  Hispanic                       251        91       278       925      -877       -85
                               (883)     (315)     (512)   (1,066)   (1,769)   (2,047)
  Married                      6,546       587     1,964     7,113    10,073    11,062
                               (629)     (222)     (427)     (839)   (1,046)   (1,093)
  Worked less than 13         -6,582    -1,090    -3,097    -7,610    -9,834    -9,951
  weeks in past year           (566)     (190)     (339)     (665)   (1,000)   (1,099)
  Constant                     9,811      -216       365     6,110    14,874    21,527
                             (1,541)     (468)     (765)   (1,403)   (2,134)   (3,896)

B. Women

  Training                     2,215       367     1,013     2,707     2,729     2,058
                               (334)     (105)     (170)     (425)     (578)     (657)
  % Impact of Training         18.46     60.76     44.42     32.25     14.47      8.09

  High school or GED           3,442       166       681     2,514     5,778     6,373
                               (341)      (99)     (156)     (396)     (606)     (762)
  Black                         -544        22       -60      -129      -866    -1,446
                               (397)     (115)     (188)     (451)     (679)     (869)
  Hispanic                    -1,151       -31      -222      -995    -1,620    -1,503
                               (488)     (130)     (194)     (546)     (911)     (992)
  Married                       -667      -213      -392      -758    -1,048      -902
                               (436)     (127)     (209)     (522)     (785)     (970)
  Worked less than 13         -5,313    -1,050    -3,240    -6,872    -7,670    -6,470
  weeks in past year           (370)     (137)     (289)     (522)     (672)     (787)
  AFDC                        -3,009      -398    -1,047    -3,389    -4,334    -3,875
                               (378)     (107)     (174)     (468)     (737)     (834)
  Constant                    10,361       649     2,633     8,417    16,498    20,689
                               (815)     (255)     (490)     (966)   (1,554)   (1,232)

Note:  The  table  reports  OLS  and  quantile  regression  estimates  of  the  effect  of  training  on  earnings. 
The  specification  used  also  includes  indicators  for  service  strategy  recommended,  age  group  and  second 
follow-up survey. Robust standard errors in parentheses.
Table IV
Quantile Treatment Effects and 2SLS Estimates

Dependent variable: 30-month earnings

                                                        Quantile
                                2SLS      0.15      0.25      0.50      0.75      0.85

A. Men

  Training                     1,593       121       702     1,544     3,131     3,378
                               (895)     (475)     (670)   (1,073)   (1,376)   (1,811)
  % Impact of Training          8.55      5.19     11.99      9.64     10.69      9.02

  High school or GED           4,075       714     1,752     4,024     5,392     5,954
                               (573)     (429)     (644)     (940)   (1,441)   (1,783)
  Black                       -2,349      -171      -377    -2,656    -4,182    -3,523
                               (625)     (439)     (626)   (1,136)   (1,587)   (1,867)
  Hispanic                       335       328     1,476     1,499       379     1,023
                               (888)     (757)   (1,128)   (1,390)   (2,294)   (2,427)
  Married                      6,647     1,564     3,190     7,683     9,509    10,185
                               (627)     (596)     (865)   (1,202)   (1,430)   (1,525)
  Worked less than 13         -6,575    -1,932    -4,195    -7,009    -9,289    -9,078
  weeks in past year           (567)     (442)     (664)   (1,040)   (1,420)   (1,596)
  Constant                    10,641      -134     1,049     7,689    14,901    22,412
                             (1,569)   (1,116)   (1,655)   (2,361)   (3,292)   (7,655)

B. Women

  Training                     1,780       324       680     1,742     1,984     1,900
                               (532)     (175)     (282)     (645)     (945)     (997)
  % Impact of Training         14.60     35.47     23.14     18.37     10.06      7.39

  High school or GED           3,470       262       768     2,955     5,518     5,905
                               (342)     (178)     (274)     (643)     (930)   (1,026)
  Black                         -554         0      -123      -401    -1,423    -2,119
                               (397)     (204)     (318)     (724)     (949)   (1,196)
  Hispanic                    -1,145       -73      -138    -1,256    -1,762    -1,707
                               (488)     (217)     (315)     (854)   (1,188)   (1,172)
  Married                       -652      -233      -532      -796        38      -109
                               (437)     (221)     (352)     (846)   (1,069)   (1,147)
  Worked less than 13         -5,329    -1,320    -3,516    -6,524    -6,608    -5,698
  weeks in past year           (370)     (254)     (430)     (781)     (931)     (969)
  AFDC                        -2,997      -406    -1,240    -3,298    -3,790    -2,888
                               (378)     (189)     (301)     (743)   (1,014)   (1,083)
  Constant                    10,538       984     3,541     9,928    15,345    20,520
                               (828)     (547)     (837)   (1,696)   (2,387)   (1,687)

Note:  The  table  reports  2SLS  and  QTE  estimates  of  the  effect  of  training  on  earnings.  Assignment 
status  is  used  as  an  instrument  for  training.  The  specification  used  also  includes  indicators  for  service 
strategy recommended, age group and second follow-up survey. Robust standard errors in parentheses.
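
As a rough illustration of how assignment status works as an instrument, the simplest IV estimate without covariates is the Wald ratio: the difference in mean earnings by assignment divided by the difference in training rates by assignment. The sketch below uses the rounded differences from Table I. It is not the specification behind Table IV, which also conditions on covariates, so it should not be expected to match the 2SLS column exactly. The function name is hypothetical.

```python
def wald_estimate(outcome_diff, takeup_diff):
    """Wald IV estimate: reduced-form difference over first-stage difference."""
    return outcome_diff / takeup_diff

# Differences by assignment status, taken from Table I (rounded values).
wald_men = wald_estimate(1116, 0.61)    # earnings difference / training difference
wald_women = wald_estimate(1242, 0.64)
```

These ratios come out at roughly 1,830 for men and 1,940 for women, the same order of magnitude as the covariate-adjusted 2SLS estimates of 1,593 and 1,780.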